ABSTRACT

We present our results on the structure and activity of massive galaxies at z = 1-3 using one of
the largest (166 with M_* > 5 × 10^10 M_⊙) and most diverse samples of massive galaxies derived from
the GOODS-NICMOS survey: (1) Sersic fits to deep NIC3/F160W images indicate that the rest-frame
optical structures of massive galaxies are very different at z = 2-3 compared to z ~ 0. Approximately
40% of massive galaxies are ultra-compact (r_e < 2 kpc), compared to less than 1% at z ~ 0. Furthermore,
most (~ 65%) systems at z = 2-3 have a low Sersic index n < 2, compared to ~ 13% at z ~ 0. We
present evidence that the n < 2 systems at z = 2-3 likely contain prominent disks, unlike most massive
z ~ 0 systems. (2) There is a correlation between structure and star formation rates (SFR). The majority
(~ 85%) of non-AGN massive galaxies at z = 2-3 with SFR high enough to yield a 5σ (30 μJy) 24
μm Spitzer detection have low n < 2. Such n < 2 systems host the highest SFR. (3) The frequency of
AGN is ~ 40% at z = 2-3. Most (~ 65%) AGN hosts have disky (n < 2) morphologies. Ultra-compact
galaxies appear quiescent in terms of both AGN activity and star formation. (4) Large stellar surface
densities imply massive galaxies at z = 2-3 formed via rapid, highly dissipative events at z > 2. The
large fraction of n < 2 disky systems suggests cold mode accretion complements gas-rich major mergers
at z > 2. In order for massive galaxies at z = 2-3 to evolve into present-day massive E/S0s, they
need to significantly increase (n, r_e). Dry minor and major mergers may play an important role in this
process.

Subject headings: galaxies: bulges — galaxies: evolution — galaxies: formation — galaxies: 
fundamental parameters — galaxies: interactions — galaxies: structure 



1. INTRODUCTION 

Studies of high-redshift galaxies are essential for testing 
and constraining models of galaxy formation. Conven- 
tional wisdom suggests galaxies are assembled and shaped 
by a combination of mergers, smooth accretion, and in- 
ternal secular evolution. Galaxies form inside cold dark 
matter halos that grow hierarchically through mergers 
with other halos and gas accretion (Somerville & Primack
1999; Cole et al. 2000; Steinmetz & Navarro 2002;
Birnboim & Dekel 2003; Keres et al. 2005; Dekel &
Birnboim 2006; Dekel et al. 2009a; Dekel et al. 2009b;
Keres et al. 2009; Brooks et al. 2009; Ceverino et al.
2010), while internal secular evolution (Kormendy &
Kennicutt 2004; Jogee et al. 2005) redistributes accreted
material. Within the paradigm of hierarchical assembly, a 
number of issues remain. It is not known when and how 
the main baryonic components of modern galaxies (bulges, 
disks, and bars) formed, but the global stellar mass density 
rose substantially over z ~ 1-3, reaching ~ 50% of
its present value by z ~ 1 (Dickinson et al. 2003b; Drory
et al. 2005; Conselice et al. 2007; Elsner et al. 2008;
Perez-Gonzalez et al. 2008).

It is also not clear how high-redshift galaxies evolve into
present-day galaxies. Complex baryonic processes such as
mergers, gas dissipation, and feedback are all at work to
some extent. There is also mounting evidence that cold-mode
accretion (Birnboim & Dekel 2003; Keres et al. 2005;
Dekel & Birnboim 2006; Dekel et al. 2009a; Dekel et al.
2009b; Keres et al. 2009; Brooks et al. 2009; Ceverino
et al. 2010) is important for building star-forming
galaxies. This process is particularly effective in
galaxies with halos of mass below 10^12 M_⊙, such that
cold-mode accretion dominates the global growth of galaxies
at high redshifts and the growth of lower mass objects at
late times.

High-redshift galaxies are different from local galaxies. 
Within the framework of hierarchical assembly, early, high- 
redshift galaxies are expected to be smaller, at a given 
mass, than their present-day counterparts. The size
difference is predicted to be a factor of a few at z = 2-3
(Loeb & Peebles 2003; Robertson et al. 2006; Khochfar &
Silk 2006; Naab et al. 2007). Several recent studies using
rest-frame optical data provide evidence for size evolution
among massive galaxies (Guzman et al. 1997; Daddi et
al. 2005; Trujillo et al. 2006, 2007; Zirm et al. 2007; Toft
et al. 2007; Longhetti et al. 2007; Cimatti et al. 2008;
Buitrago et al. 2008; van Dokkum et al. 2008, 2010; van
der Wel et al. 2011). Aside from size evolution, there is
some evidence that the nature of red galaxies changes at 
higher redshift. At z < 1, the red sequence primarily con- 
sists of old, passively evolving galaxies (Bell et al. 2004). 
Among extremely red galaxies (EROs) at z = 1-2, less
than 40% are morphologically early types (Yan & Thompson
2003; Moustakas et al. 2004). It is well known that
star formation rates were more intense at higher redshift
(Daddi et al. 2007; Drory & Alvarez 2008), and a link has
been found between star formation, size, and morphology
at z ~ 2.5. Toft et al. (2007) and Zirm et al. (2007) find
from NICMOS rest-frame optical imaging that blue
star-forming galaxies are significantly more extended than
red quiescent galaxies. Additionally, examples of rapidly
star-forming galaxies (SFR ~ 50-200 M_⊙ yr^-1) at z ~ 2-3,
whose ionized gas kinematics are consistent with turbulent
rotating disks, are found in the SINS survey (Förster
Schreiber et al. 2009; Genzel et al. 2008; Shapiro et al.
2008).

Progress on understanding the evolution of massive 
galaxies at high redshift has been hindered by significant 
observational challenges. The deep optical surveys carried
out by HST ACS, such as the Hubble Ultra Deep
Field (HUDF, Beckwith et al. 2006) and the Great
Observatories Origins Deep Survey (GOODS, Giavalisco et al.
2004), trace rest-frame optical galaxy morphology only out
to z ~ 1. At higher z, bandpass shifting causes filters
to trace progressively bluer bands, and optical filters
trace the rest-frame UV at z > 2. UV light traces massive
young stars but sets few constraints on the
overall mass distribution, making it difficult to probe the
structure and mass of galaxy components at early epochs.

Without high-resolution, deep, rest-frame optical imaging,
it is not possible to robustly compare structural
parameters in galaxies across redshift. NIR imaging is
required to probe the rest-frame optical at z ~ 1-3.
Unfortunately, deep NIR imaging with HST has been
completed for a limited number of galaxies over relatively small
fields and small volumes at z > 1, with most pointings
being within the Hubble Deep Fields and the Hubble Ultra
Deep Field due to the inefficiency of the NICMOS cam- 
era in covering large areas (e.g., Dickinson et al. 2004; 
Thompson et al. 2005; Zirm et al. 2007; van Dokkum 
et al. 2008). While ground-based NIR imaging surveys 
(e.g., Kajisawa et al. 2006; Retzlaff et al. 2010) efficiently 
cover wide fields at resolutions almost comparable to HST 
NICMOS, the depths reached are at least an order of mag- 
nitude shallower. 

A large-area, high-resolution, deep, space-based NIR
survey would be a boon for galaxy formation studies.
The GOODS-NICMOS Survey (GNS; Conselice et
al. 2011), covering 44 arcmin^2 of the GOODS fields with
NIC3, is a strong first effort. The GOODS-North and
GOODS-South are among the best-studied regions in the 
sky and are a natural choice for such a survey. The 
GOODS fields already have deep data from HST ACS
(Giavalisco et al. 2004), Spitzer IRAC/MIPS (Dickinson
et al. 2003a), and Chandra (Giacconi et al. 2002; Alexander
et al. 2003; Lehmer et al. 2005; Luo et al. 2008),
among others. GNS consists of 60 pointings centered on
massive (M_* > 10^11 M_⊙) galaxies at z > 2, observed to
a depth of H = 26.8 magnitudes. The value of GNS lies
in the fact that the target fields were optimized to include 
massive galaxies selected by multiple methods in order to 
create an unbiased sample (see Conselice et al. 2011). 
There are additional massive galaxies in each field beyond
the 60 main targets, so that there are 82 galaxies with
M_* > 10^11 M_⊙ at z = 1-3 across all pointings. Thus,
the GNS data contain one of the largest samples of very 
massive galaxies at high redshift with rest-frame optical 
imaging, and they robustly probe massive galaxies when 
the Universe was less than 1/3 of its current age, during 
the epoch of bulge and disk formation. 

The goal of this work is to investigate the evolution of
massive galaxies over z = 1-3 with this unique sample.
We take advantage of the existing rich ancillary data
to derive star formation rates (SFR) from 24 μm detections
and look for AGN activity based on X-ray detections
and mid-IR SEDs. We correlate rest-frame optical
structural parameters with SFR to gain insight into how 
massive galaxies are expected to evolve. 

The plan of this paper is as follows. We discuss the data
and sample properties in §2. In §3 we describe the
measurement of structural parameters, and in §3.2 we make
a detailed comparison with z ~ 0 galaxies of similar stellar
mass. A detailed artificial redshifting experiment is
conducted in §3.3.1 to explore the impact of instrumental
and redshift-dependent effects on structural parameters.
We then measure star formation properties based
on Spitzer MIPS 24 μm detections and discuss how they
relate to structural properties. Next, we present estimates
of the mass and fraction of cold gas in massive star-forming
galaxies at z = 2-3, and we use a variety of techniques
(X-ray properties, IR power-law, and IR-to-optical excess)
to identify AGN and consider how galaxy
activity relates to galaxy structure. Finally, we discuss
and summarize our results. All calculations assume
a flat ΛCDM cosmology with Ω_Λ = 0.7 and H_0 = 70
km s^-1 Mpc^-1.
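
As an illustration of what the adopted cosmology implies, the following minimal Python sketch evaluates the proper scale subtended by a given angle and the age of the Universe at a given redshift. It is not the authors' code. The matter density Ω_m = 1 - Ω_Λ = 0.3 follows from the stated flatness, radiation is neglected (an assumption), and the function names and the Simpson integrator are illustrative choices.

```python
import math

H0 = 70.0                # Hubble constant in km/s/Mpc (from the text)
OMEGA_L = 0.7            # dark-energy density (from the text)
OMEGA_M = 1.0 - OMEGA_L  # implied by flatness; radiation neglected (assumption)
C_KM_S = 299792.458      # speed of light in km/s
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)
KM_PER_MPC = 3.0856775814913673e19
SEC_PER_GYR = 3.15576e16


def e_of_z(z):
    """Dimensionless Hubble rate E(z) = H(z)/H0 for flat LCDM."""
    return math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def kpc_per_arcsec(z):
    """Proper transverse scale at redshift z, in kpc per arcsecond."""
    d_c = (C_KM_S / H0) * simpson(lambda x: 1.0 / e_of_z(x), 0.0, z)  # Mpc
    d_a = d_c / (1.0 + z)  # angular diameter distance in a flat universe
    return d_a * 1000.0 * ARCSEC_TO_RAD


def age_gyr(z):
    """Age of the Universe at redshift z, in Gyr."""
    # With scale factor a = 1/(1+z): t = (1/H0) * int_0^a da' / (a' E(a')).
    a_max = 1.0 / (1.0 + z)

    def integrand(a):
        # The integrand tends to zero as a -> 0 in a matter-dominated era.
        return 0.0 if a == 0.0 else 1.0 / (a * e_of_z(1.0 / a - 1.0))

    hubble_time_gyr = KM_PER_MPC / H0 / SEC_PER_GYR
    return hubble_time_gyr * simpson(integrand, 0.0, a_max)
```

Under these assumptions, `0.3 * kpc_per_arcsec(2.0)` gives about 2.5 kpc for the 0.3" NIC3 PSF, and `age_gyr(2.0) / age_gyr(0.0)` is below one third, in line with the values quoted in the text.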

2. DATA AND SAMPLE 

2.1. Observations and Pointing Selections for GNS 

Our data come from the GOODS-NICMOS Survey
(GNS; Conselice et al. 2011). GNS is a deep, 180-orbit
survey with the HST NICMOS-3 camera in the F160W
(H) band that probes rest-frame optical light from galaxies
at z ~ 1-3. The coverage extends over both ACS GOODS
fields and is divided among 60 pointings centered on
massive (M_* > 10^11 M_⊙) galaxies at z > 2. Each pointing covers
51.2" × 51.2" and was observed to a depth of three orbits
in nine exposures of ~ 900 seconds (~ 135 minutes per
pointing). A total of ~ 8300 sources were detected across
an effective area of ~ 44 arcmin^2. The 5σ limiting
magnitude for an extended source with a 0.7" diameter is H = 26.8
(AB). The NIC3 images were drizzled with a pixfrac of
0.7 and an output platescale of 0.1". The NIC3 camera is
currently out of focus, and after detailed investigation (see
the Appendix), we find that the point spread function (PSF)
has a full width at half maximum (FWHM) of 0.26"-0.36",
with a mean value of 0.3".

The 60 GNS pointings were planned by identifying massive
galaxies having a photometric redshift of 1.5 < z < 2.9
and stellar mass M_* > 10^11 M_⊙ via three color selection
criteria. The target galaxies include Distant Red Galaxies
(DRGs, Papovich et al. 2006), Extremely Red Objects
(EROs, Yan et al. 2004), and BzK-selected galaxies
(Daddi et al. 2004). All of these methods are designed to
select red dusty or red passively evolving galaxies. DRGs
have evolved stellar populations that are identified with
J - K > 2.3 (Vega mag). EROs are selected based on
Spitzer and NIR data via f_ν(3.6 μm)/f_ν(z850) > 20. This
selection is sensitive to red populations that are either old
or reddened, so EROs contain a mixture of young and old
stellar populations. BzK galaxies are selected based on
the quantity BzK = (z - K)_AB - (B - z)_AB. Galaxies
with BzK > -0.2 at z > 1.4 are identified as star-forming
galaxies. Redder and possibly more evolved galaxies are
identified with BzK < -0.2 and (z - K)_AB > 2.5. The
final pointings were designed to include at least one red
massive galaxy and to also maximize the total number of
additional galaxies (e.g., Lyman-break galaxies and
sub-mm galaxies) within each pointing.
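
The three color cuts quoted above can be written compactly as follows. This is a minimal sketch, not the selection code used for GNS: the function names and return labels are hypothetical, and the inputs are assumed to be magnitudes (AB for BzK, Vega for DRGs) and fluxes in matching units for the 3.6 μm to z850 ratio.

```python
def classify_bzk(b_ab, z_ab, k_ab):
    """Apply the BzK cuts quoted in the text to AB magnitudes in B, z, K."""
    bzk = (z_ab - k_ab) - (b_ab - z_ab)
    if bzk > -0.2:
        return "star-forming BzK"
    if bzk < -0.2 and (z_ab - k_ab) > 2.5:
        return "passive BzK"
    return "not selected"


def is_drg(j_vega, k_vega):
    """Distant Red Galaxy cut: J - K > 2.3 in Vega magnitudes."""
    return (j_vega - k_vega) > 2.3


def is_ero(flux_3p6um, flux_z850):
    """ERO cut: flux density ratio f(3.6 um) / f(z850) > 20."""
    return flux_3p6um / flux_z850 > 20.0
```

For example, a source with B = 25.0, z = 24.0, and K = 22.0 has BzK = 1.0 and is classified as a star-forming BzK galaxy under these cuts.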

2.2. Our Sample of Massive Galaxies at z = 1-3

The sample of massive galaxies that we work with in 
this paper is not limited to the original color-selected mas- 
sive galaxies at z > 1.5 defining the original 60 GNS 
pointings. Instead, our sample of massive galaxies at 
z = 1 — 3 is derived from the set of all galaxies mapped 
with NIC3/F160W across the 60 fields, and for which a 
reliable stellar mass and photometric redshift were
estimated by Conselice et al. (2011), based on SED fits to the
NIC3/F160W and optical imaging. A detailed description
of how these quantities were estimated is in Conselice et
al. (2011), and we only briefly summarize the methodology
here.

The source extraction catalog for the NICMOS images
across the 60 pointings of the GNS survey contains ~ 8300
sources with H < 28 and V < 30. For those galaxies
detected in the ACS BViz and NICMOS H bands, we use
the available photometric redshifts and stellar masses from
Conselice et al. (2011). Photometric redshifts were
determined by fitting template spectra to the BVizH data.
Stellar masses were measured by fitting the BVizH
magnitudes to a grid of SEDs generated from Bruzual &
Charlot (2003) stellar population synthesis models, assuming a
Salpeter IMF (see footnote). The grid includes different
colors, ages of stellar populations, metallicities, dust
content, and star formation histories as characterized by
exponentially declining models. In general, the stellar
masses derived depend on the SED used and the assumptions
used in the SED modeling, such as the IMF, the metallicity,
the extinction law, and star formation history (e.g., Borch
et al. 2006; Marchesini et al. 2009; Conselice et al. 2011).
The typical uncertainty in stellar mass across the sample is
a factor of ~ 2-3.

Footnote: We use a Chabrier IMF for the SFR estimates presented later in the paper. Using a Chabrier IMF rather than a Salpeter IMF in
estimating the stellar mass would lower the values by 0.25 dex or less.

In order to account for a small number (15) of additional
massive (M_* > 5 × 10^10 M_⊙) red systems, which are
undetected in the GOODS ACS BV and therefore do not
have viable stellar masses from the above techniques, we
use available masses and redshifts (Buitrago et al. 2008;
Bluck et al. 2009) based on deep ground-based RIJHK
data along with ACS iz data, where available. Photometric
redshifts are determined with a mixture of techniques
(e.g., neural networks and Bayesian techniques) described
more fully in Conselice et al. (2007). Stellar masses were
measured from these data with uncertainties of a factor
of ~ 2-3 with the multi-color stellar population fitting
techniques from Conselice et al. (2007, 2008). As with
the larger sample described above, a stellar mass is
produced by fitting model SEDs to the observed SED for each
galaxy. A Salpeter IMF is assumed, and the SED grids are
constructed from Bruzual & Charlot (2003) stellar
population synthesis models.

From the sample of galaxies with photometric redshifts 
and stellar masses determined as described above, we de- 
fine the sample of massive galaxies used in this paper. We 
restrict our analysis to the redshift interval z = 1-3
over which our NIC3/F160W images probe the rest-frame
optical light in order to avoid bandpass shifts into the
rest-frame UV. This ensures that we measure all structural
parameters in the rest-frame optical across z = 1-3, thereby
reducing bandshift biases (see §3.1 for a quantitative
estimate). Although the mass functions calculated for GNS
by Mortlock et al. (2011) show that the mass completeness
limit is ~ 3 × 10^9 M_⊙ at z ~ 3, we apply a higher
mass cut of 5 × 10^10 M_⊙ as our interest is specifically in
the most massive galaxies.

Our final sample consists of the 166 (82) massive galaxies
with M_* > 5 × 10^10 M_⊙ (M_* > 1 × 10^11 M_⊙) and
z = 1-3. This is the largest HST-based dataset with
rest-frame optical imaging of massive galaxies over z = 1-3.
The galaxies with M_* > 10^11 M_⊙ from Buitrago et al.
(2008) are part of the sample. The other previous HST
NICMOS studies (e.g., Toft et al. 2007; Zirm et al. 2007;
van Dokkum et al. 2008) each contain, at most, 10-20
systems with M_* > 10^11 M_⊙. The full distributions of
apparent H and V magnitude, stellar mass, and redshift
for this sample are shown in Figure 1.

Figure 2 shows a comparison of the galaxy stellar mass
function (SMF) of our GNS-based sample to the published
SMF of other NIR-selected samples in the literature, such
as the K-selected samples of Fontana et al. (2006),
Kajisawa et al. (2009), and Marchesini et al. (2009), as
well as the IRAC-selected sample of Perez-Gonzalez et al.
(2008). This figure shows that for the mass
range (M_* > 5 × 10^10 M_⊙) relevant for the GNS-based
sample used in our paper, there is good agreement
between the SMF of our sample and those from these four
studies. In particular, at M_* > 5 × 10^10 M_⊙, the top
panel shows that there is very good agreement among our
sample, Fontana et al. (2006), and Perez-Gonzalez et al.
(2008) for three different redshift bins between z = 1.5 and
z = 3.0. In the lower panel, at M_* > 5 × 10^10 M_⊙, the
average SMF from Kajisawa et al. (2009) agrees with that
of our sample within a factor of ~ 2 over 1.5 < z < 2.5.
The SMFs from our GNS-based sample and Marchesini et
al. (2009) show good agreement at z = 2-3 and are
slightly offset at z = 1.3 to 2.0. The small offset may not
be statistically significant if one includes all the sources of
error. The error bars on the GNS mass functions include
Poisson errors only. Marchesini et al. (2009) show that the
dominant sources of error regarding stellar mass functions
are cosmic variance and systematics from the assumptions
used in the SED modeling. For a discussion of the SMF
for lower mass (M_* < 5 × 10^10 M_⊙) galaxies, which are
not included in the sample used in this paper, we refer the
reader to Mortlock et al. (2011).

In our sample of 166 massive galaxies, spectroscopic
redshifts are available for 44 galaxies (26.5 ± 3.4% of the
sample). These 44 galaxies are all bright, with V < 27 and
H_AB < 23. Among these 44 galaxies, the median photometric
redshift error is δz/(1 + z) = 0.071 (Grützbauch et
al. 2010), 7/44 (15.9 ± 5.5%) have δz/(1 + z) > 0.2, and
none have δz/(1 + z) > 0.5 (see footnote). For the remaining 122/166
(73.5 ± 3.4%) of our sample galaxies without spectroscopic
redshifts, photometric redshifts are used. Among these 122
galaxies, 60 (49.2 ± 4.5%) are fainter than V = 27, and the
uncertainties in photometric redshifts may be larger than
the median value of 0.071 cited above.
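
The percentages and uncertainties quoted above are numerically consistent with a simple binomial standard error, sqrt(p(1 - p)/N). The text does not state which estimator was used, so the following Python sketch is an inference from the numbers rather than a description of the authors' method; the function name is illustrative.

```python
import math


def fraction_with_error(k, n):
    """Return (percentage, binomial standard error in percent) for k of n."""
    p = k / n
    err = math.sqrt(p * (1.0 - p) / n)
    return 100.0 * p, 100.0 * err


# Counts taken from the text: spectroscopic redshifts, large photo-z errors,
# and faint (V > 27) galaxies among those without spectroscopic redshifts.
for k, n in [(44, 166), (7, 44), (122, 166), (60, 122)]:
    pct, err = fraction_with_error(k, n)
    print(f"{k}/{n}: {pct:.1f} +/- {err:.1f} %")
```

Rounded to one decimal place, these reproduce the 26.5 ± 3.4%, 15.9 ± 5.5%, 73.5 ± 3.4%, and 49.2 ± 4.5% given in the paragraph.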

2.3. Properties and Selection Biases in the Sample 

We estimate the number density of massive (M_* >
5 × 10^10 M_⊙) galaxies over z = 2-3 to be ~ 5 × 10^-4
Mpc^-3 (see Conselice et al. 2011 for a detailed
discussion of the number density of massive galaxies in the
GNS sample). The corresponding stellar mass density is
~ 6 × 10^7 M_⊙ Mpc^-3. The massive GNS galaxies are
collectively 10-100 times more abundant than SMGs, which
have space densities of 10^-6 to 10^-5 Mpc^-3 at z ~ 2-3
(Blain et al. 2002). Rather, the number density is in
agreement with published values (Daddi et al. 2005, 2007)
for other passively evolving and star-forming galaxies at
z ~ 2.

How does our sample break down in terms of the typical 
color-selection methods, which are usually used to identify 
massive high redshift galaxies? About 63% (104/166) of 
this final sample is listed in existing catalogs for DRG 
(Papovich et al. 2006), BzK (Daddi et al. 2004), or 
ERO (Yan et al. 2004) galaxy populations. There are 
8, 9, and 43 sources that are uniquely listed in one of the
DRG, BzK, or ERO galaxy catalogs, respectively. An
additional 44 sources are listed in two or more of these
catalogs. About 37% (62/166) of the sources were not
previously identified as DRG, ERO, or BzK galaxies.

What are the selection biases impacting our sample? 
General biases in the selection of massive galaxies in the 
GNS survey have been discussed in Conselice et al. (2011), 
and we only discuss below the points relevant for our sam- 
ple. 

The 60 GNS pointings were selected to include massive
galaxies identified via three color methods (DRG, BzK,
and IERO). Combining all three color criteria, rather than
using any single one, is already a step forward compared
to many earlier studies because no single criterion would
isolate a complete sample of massive galaxies (e.g., van
Dokkum et al. 2006; Conselice et al. 2011). These three
criteria all pick massive galaxies with red observed colors,
but due to the range of criteria involved, they can pick
both red dusty systems and red evolved stellar populations.

Footnote: While figure 6 of Conselice et al. (2011) shows that ~ 15-20% of bright (20 < H_AB < 23) galaxies with spectroscopic redshifts are
catastrophic outliers in photometric redshift with δz/(1 + z) > 0.5, it should be noted that there are no catastrophic outliers with such large
δz/(1 + z) > 0.5 among the 44 galaxies with spectroscopic redshifts in our sample of massive (M_* > 5 × 10^10 M_⊙) galaxies at z = 1-3. The
outliers with δz/(1 + z) > 0.5 in the GNS survey have stellar masses below the cutoff value of our sample and/or lie outside its redshift range.

Another key step that makes our study less biased to- 
wards a specific type of massive galaxy is that our work- 
ing sample at z = 1 — 3 is neither limited to nor defined 
by the original color-selected massive galaxies. Rather, it 
is derived from all galaxies within the survey area that 
are bright enough to be mapped with NIC3/F160W and 
for which a reliable stellar mass and photometric redshift 
could be determined by Conselice et al. (2011), as
outlined in §2.2. The first potential bias in this final sample
is introduced by excluding galaxies that are undetected
by NIC3/F160W. The second potential bias is introduced
by excluding detected galaxies for which no reliable
stellar mass and photometric redshift could be determined.
For instance, ultra-dusty galaxies may not be detected in
enough of the optical bands to allow a photometric redshift
to be reliably estimated.

We assess the impact of the second bias by estimating
how many massive galaxies we might miss due to the lack
of available photometric redshifts and stellar masses. Of the
8300 sources detected by GNS, 1076 have no photometric
redshift and stellar mass measurements. Most (68%) of
these 1076 sources are fainter (H > 25) than our sample of
massive galaxies (Figure 1). Among GNS objects as bright
(H < 25) as our sample of massive galaxies, only 8.5%, or
349/4083, have no redshift or stellar mass measurements.
Furthermore, not all of these bright (H < 25) sources
will be massive, so that this fraction represents an upper
limit on the sources we might not include in our sample
due to the lack of a photometric redshift or stellar mass
measurement.

We next discuss the impact of the first potential bias 
and the type of objects the GNS survey might not detect. 
It is relevant to ask whether we might miss galaxies with 
blue observed colors. We believe this is not the case for
two reasons. First, as discussed above, our working
sample is not strongly biased against galaxies with blue
observed colors because it is not limited to those massive
galaxies selected by the three color methods (DRG, BzK,
and IERO) that preferentially pick galaxies with red
observed colors. Second, Conselice et al. (2011) explicitly
show that many galaxies with blue observed (z - H)
colors, which would have been undetected by these color
selections, do get included in this final sample of massive
galaxies for the GNS survey. Nearly all known Lyman
Break Galaxies or BX/BM objects (Reddy et al. 2008)
at z = 2-3 in the GNS fields are detected by the GNS
NIC3/F160W imaging (Conselice et al. 2011).

In terms of rest-frame colors, rather than observed colors,
it is also important to note that the galaxies detected
by GNS at z = 1-2 or z = 2-3 include systems with both
blue and red rest-frame U - V colors. The rest-frame U - V
color ranges from about -0.4 to 2.1 over the stellar
mass range shown in Figure 3. The systems
with blue rest-frame U - V colors are preferentially
at low masses, while GNS galaxies with M_* > 1 × 10^11
M_⊙ at z = 2-3 have preferentially red rest-frame U - V
colors, in the range of 1.0 to 1.7. These inherently red
rest-frame U - V colors of the massive galaxies at z = 2-3
could be due to a combination of old stellar populations
and dusty young star-forming regions. We checked that
the colors are consistent with stellar population synthesis
models (based on Bruzual & Charlot (2003) and assuming
a Chabrier IMF and an exponentially declining star formation
history with a 100 Myr e-folding time). We find that even
without dust extinction U - V rises rapidly. Assuming
solar metallicity, U - V is already ~ 1 at an age of 0.5
Gyr and reaches ~ 1.6 at 2 Gyr. For the case with dust
extinction and an optical depth of 1, U - V is ~ 1.1 after
0.5 Gyr and ~ 1.8 after 2 Gyr.

3. STRUCTURAL PROPERTIES OF MASSIVE GALAXIES 

3.1. Structural Decomposition 

We characterize the massive GNS galaxies with struc- 
tural decomposition. Ideally, one would like to fit multiple 
components (bulge, disk, bar, nuclear point source, etc.) 
in the decomposition, but the 0.3" diameter (or full width
at half maximum) of the PSF (corresponding to ~ 2.4 kpc at
z = 1-3) prevents such detailed decomposition (see footnote).
Instead, we choose to fit the 2D light distributions with only
single Sersic (1968) r^(1/n) profiles, which have the form



    I(r) = I_e \exp\left\{ -b_n \left[ \left( \frac{r}{r_e} \right)^{1/n} - 1 \right] \right\}    (1)

where I_e is the surface brightness at the effective radius
r_e and b_n is a constant that depends on Sersic index n.
Knowledge of the PSF is important for deriving structural 
parameters. We model the PSF (see the Appendix) while taking
into account both the variation in PSF with position
on the NIC3 field and the dependence on the drizzle
algorithm. We find a range in PSF FWHM of ~ 0.26"-0.36".
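
To make the Sersic profile of Equation (1) concrete, the following Python sketch evaluates it for a given (I_e, r_e, n). It is illustrative only and is unrelated to the GALFIT fitting itself. The text says only that b_n depends on n; here b_n is computed from the asymptotic series of Ciotti & Bertin (1999), which is an assumption of this sketch and is accurate for n greater than roughly 0.4. With the standard convention, this choice of b_n makes r_e the radius enclosing half of the total light.

```python
import math


def sersic_bn(n):
    """Approximate b_n with the Ciotti & Bertin (1999) asymptotic series."""
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n) + 46.0 / (25515.0 * n ** 2)


def sersic_intensity(r, i_e, r_e, n):
    """Surface brightness I(r) of Equation (1).

    r and r_e must share the same units; i_e is the surface brightness
    at the effective radius r_e; n is the Sersic index.
    """
    return i_e * math.exp(-sersic_bn(n) * ((r / r_e) ** (1.0 / n) - 1.0))
```

By construction `sersic_intensity(r_e, i_e, r_e, n)` returns `i_e`. At fixed (I_e, r_e), an n = 4 profile is far more centrally concentrated than an n = 1 (exponential disk) profile, which is the basis for using n < 2 as a disk indicator.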

It is clear that a single Sersic profile is not a complete 
indicator of overall galaxy structure. For instance, in de- 
tailed images of nearby galaxies, the best-fit index n for 
a single Sersic profile does not always correlate with the 
bulge Sersic index obtained with 2D bulge-disk or bulge- 
disk-bar decomposition (Weinzirl et al. 2009). However, 
the single Sersic index n is on average a good way to
separate disk-dominated galaxies from the class of luminous
spheroidal and bulge-dominated galaxies (see §3.3.1), and
in studies of high-redshift galaxies the criterion n < 2 is
often used to separate spirals or disk galaxies from
ellipticals (e.g., Ravindranath et al. 2004; Bell et al. 2004;
Jogee et al. 2004; Barden et al. 2005; Trujillo et al. 2007;
Buitrago et al. 2008).

The NIC3/F160W images of the 166 sample galaxies 
were fit with a single Sersic component using GALFIT 
(Peng et al. 2002). In each image, objects that were near, 
but not blended with, the primary source were masked 
out. For the fraction (~ 15%) of the primary galaxies that 
were blended or overlapping with another galaxy identified 
in the source extraction catalog, the blended sources were 
each fitted simultaneously with a separate Sersic profile. 
Some fraction of primary galaxies appeared morphologically
disturbed (~ 8%; see Figure 4 and §3.2), but these
were fitted with only a single Sersic profile as they
counted as only a single galaxy in the source extraction
catalog.

Footnote: For the more extended galaxies, multiple-component (e.g., bulge
and disk) decomposition was attempted with limited success; this is discussed in §7.1.

Bandpass shifting causes the H-band central wavelength
to move from 8000 to 4000 Å over z = 1-3. The z = 1-2 and
z = 2-3 bins used in Figure 5, for example, correspond
to 5333-8000 Å (I-band) and 4000-5333 Å (B-band),
respectively. Even with the bandpass shifting, comparing
the structural parameters (n, r_e) measured in these two
bands to each other and to parameters of z ~ 0 galaxies
measured in rest-frame B is a vast improvement over
previous studies forced to compare the rest-frame UV at
z > 1 to the rest-frame optical at z < 1. The systematic
effects resulting from the H-band changing from I to B-band
over z = 1-3 are small, as can be inferred from studies
of nearby galaxies. Graham (2001) presents bulge-disk
decompositions of local z ~ 0 galaxies based on images in
the B and I bands. The median ratio in B-band/I-band
disk scalelength is 1.13, so that the disks are measured to
be slightly larger in the B-band. If similar errors apply
here, then the bias in r_e due to bandpass shifting is on the
order of 10%.
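
The rest-frame wavelengths quoted above follow from dividing the observed wavelength by (1 + z). A minimal Python sketch, assuming a nominal F160W central wavelength of 16000 Å (the value implied by the 8000 Å and 4000 Å figures in the text; the constant and function names are illustrative):

```python
F160W_CENTRAL_ANGSTROM = 16000.0  # assumed nominal H-band central wavelength


def rest_wavelength(z, observed=F160W_CENTRAL_ANGSTROM):
    """Rest-frame wavelength (Angstrom) probed at redshift z."""
    return observed / (1.0 + z)


for z in (1.0, 2.0, 3.0):
    print(f"z = {z:.0f}: {rest_wavelength(z):.0f} Angstrom")
```

This gives 8000, 5333, and 4000 Å at z = 1, 2, and 3, matching the bin edges given in the paragraph.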

Another important consideration is the effect of poten- 
tial AGN on the structural fits. When fitting high res- 
olution images of nearby galaxies, it is well known that 
fitting a galaxy that hosts a point source with a single 
Sersic component will lead to an artificially high Sersic in- 
dex n (typically n > 4; e.g., Weinzirl et al. 2009; Pierce 
et al. 2010). If a point source is added to the Sersic 
model, the index n of the Sersic component falls to more 
reasonable values. In the case of the massive GNS galaxies
at z = 1-3, we expect that the low resolution (0.3",
corresponding to 2.5 kpc at z ~ 2) of the NIC3/F160W
images will reduce the effect of potential point sources on
the structural decomposition. However, for completeness,
we have fitted all the galaxies at z = 1-3 in which a
potential AGN was identified via a variety of techniques
with both a Sersic component and a point source. The
fractional luminosity of the point source components, or
PSF/Total light ratio, ranges from 1% to 46%, with a median
of 10%. As expected, including the point source generally
produces small changes in (n, r_e), in the direction
of lowering n and enlarging r_e. Overall, our results are not
biased by the presence of AGN. In the rest of the paper,
we therefore choose to use the structural parameters from
the single Sersic component fit.

3.2. Derived Structural Properties at z = 2 — 3 

The results of the structural fits to the NIC3/F160W
images of the 166 sample galaxies are shown in Table 1,
Figure 4, Figure 5, and Figure 6.

Figure 4 shows examples of massive (M_* > 5 × 10^10
M_⊙) galaxies at z = 2-3 with different ranges of Sersic
index n and effective radius r_e. The majority (~ 82%;
Table 1) of the massive GNS galaxies at z = 2-3 have
r_e < 4 kpc. In such systems, structural features are
generally hard to discern due to resolution effects, so that
systems appear fairly featureless (top 4 rows of Figure 4).
In the small fraction of massive galaxies at z = 2-3
with large r_e > 4 kpc, one can discern some structural
features such as an elongated bar-like feature or a
combination of a central condensation surrounded by a more
extended lower surface brightness component, reminiscent
of a bulge and disk (5th row). Row 6 contains
morphologically disturbed systems. The fraction of such systems is
small, only ~ 8%, but this is a lower limit given
redshift-dependent effects such as degraded physical resolution and
surface brightness dimming.

The lower two rows of Figure 5 show the rest-frame
optical Sersic index n and effective radius r_e for the
samples of massive galaxies at z = 1-2 and z = 2-3. For
comparison, the top row of Figure 5 also shows the
rest-frame optical structural parameters for z ~ 0 galaxies of
similar stellar mass taken from Allen et al. (2006), who
performed a single component Sersic fit to B-band images
of galaxies in the Millennium Galaxy Catalogue (MGC), a
large ground-based imaging and spectroscopic survey over
37.5 deg^2 (Liske et al. 2003; Driver et al. 2005). It is
clear from Figure 5, Figure 6, and Table 1 that the
massive galaxies at z = 2-3 are strikingly offset toward lower
(n, r_e) compared to the massive z ~ 0 galaxies.

Firstly, we find that the majority (64.9 ± 5.4% for
M_* > 5 x 10^10 M_sun, and 58.5 ± 7.7% for M_* > 10^11 M_sun)
of massive galaxies at z = 2 - 3 have low n < 2, while the
fraction at z ~ 0 is five times lower. We will present
evidence in §7.1 that most of the massive systems with a low
n < 2 harbor a massive disk component, so that our
results point to the predominance of disk-dominated systems
among massive galaxies at z = 2 - 3.

Secondly, we also find that massive galaxies at z = 2 - 3
typically have smaller r_e than massive galaxies at z ~ 0.
In particular, ~ 40% (39.0 ± 5.6% for M_* > 5 x 10^10 M_sun
and 39.0 ± 7.6% for M_* > 1 x 10^11 M_sun) of massive
galaxies at z = 2 - 3 are ultra-compact (r_e < 2 kpc), compared to
less than one percent at z ~ 0. The massive ultra-compact
(r_e < 2 kpc) galaxies at z = 2 - 3 have few counterparts
among z ~ 0 massive galaxies.

The population of galaxies with low n < 2 and the
population of ultra-compact (r_e < 2 kpc) galaxies show limited
overlap. Only 28.0 ± 6.4% of the systems with low n < 2
are ultra-compact and the remaining majority (72.0 ± 6.3%
for M_* > 5 x 10^10 M_sun, and 75.0 ± 8.8% for M_* > 10^11 M_sun)
are extended (r_e > 2 kpc). Conversely, among the
ultra-compact (r_e < 2 kpc) systems, nearly half (46.7 ± 9.1% for
M_* > 5 x 10^10 M_sun, and 37.5 ± 12.1% for M_* > 10^11 M_sun)
have low n < 2.

Figure 7 further illustrates the striking difference
between massive galaxies at z = 2 - 3 and z ~ 0 by
comparing their effective radius r_e and their mean rest-frame
optical surface brightness <μ_e> within r_e. The value of
<μ_e> was measured from the extinction-corrected
rest-frame B-band light within r_e and is defined as:

\langle \mu_e \rangle = B_{\rm corr} + 2.5 \log_{10}(2 \pi r_e^2) - 10 \log_{10}(1 + z)    (2)

where B_corr is the extinction-corrected, rest-frame
apparent B magnitude and -10 log10(1 + z) is the
correction for surface brightness dimming. The MGC galaxies
at z ~ 0 are corrected only for Galactic extinction, while
for the GNS galaxies the correction includes Galactic and
internal extinction. The mean rest-frame optical surface
brightness can be 2.0 to 6.0 magnitudes brighter for the
massive galaxies at z = 2 - 3 than for z ~ 0 massive
galaxies. This is due to their smaller sizes and likely
differences in the age of the stellar populations. The high
mean rest-frame optical surface brightness of the massive
galaxies at z = 2 - 3 translates into high mean stellar
mass densities, and suggests that highly dissipative events
played an important role in their formation (see ©. 
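
As a minimal sketch of how Equation (2) can be evaluated, the function below computes the mean surface brightness within r_e from an extinction-corrected apparent magnitude, an effective radius, and a redshift. The function and argument names are illustrative, and the choice of arcseconds for r_e (giving mag/arcsec^2) is an assumption not stated explicitly in this section.

```python
import math


def mean_surface_brightness(b_corr, r_e_arcsec, z):
    """Mean surface brightness within r_e, following Eq. (2).

    b_corr     : extinction-corrected rest-frame apparent B magnitude
    r_e_arcsec : effective radius in arcsec (assumed unit)
    z          : redshift
    """
    # Area term: 2.5 log10(2 pi r_e^2)
    area_term = 2.5 * math.log10(2.0 * math.pi * r_e_arcsec ** 2)
    # Cosmological dimming correction: 10 log10(1 + z)
    dimming_term = 10.0 * math.log10(1.0 + z)
    return b_corr + area_term - dimming_term
```

For a fixed B_corr and r_e, moving a source from z = 0 to z = 2.5 changes the result only through the dimming term, 10 log10(3.5), which is about 5.4 magnitudes.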

It is worth noting that the use of deeper images for the
z ~ 0 galaxies could make the large offset in (n, r_e) at
z = 2 - 3 versus z ~ 0 even stronger. The MGC B-band
images have a median sky background of 22 mag/arcsec^2.
Low surface brightness halos may be detected around some
of the z ~ 0 galaxies in deeper exposures. This is true for
some massive elliptical and cD galaxies, and in these cases
the (n, r_e) are significantly boosted if the halo region is
also fitted (Kormendy et al. 2009).

How do these results compare with earlier studies? 
While many of the earlier studies focused on small sam- 
ples, this work is a step forward because of the improved 
number statistics that come with an unbiased and com- 
plete sample of massive galaxies. The observed apparent 
size evolution in our data generally agrees with results re- 
ported in other studies of massive galaxies (e.g., Daddi et 
al. 2005; Trujillo et al. 2007; Zirm et al. 2007; Toft et al. 
2007; Buitrago et al. 2008; van Dokkum et al. 2008; 2010; 
Williams et al. 2010). 

The ratio in r_e of high-redshift galaxies with respect to
z ~ 0 galaxies, or r_e/r_e,z~0, can be modeled as a power
law in redshift of the form α(1 + z)^β, where α and β are
constants. Using the z ~ 0 massive (M_* > 5 x 10^10 M_sun)
MGC galaxies as the normalization, we measure α and β
for different subsamples of the massive galaxies and
summarize the results in Table 2. For all galaxies the slope
β is -1.30 for a fit over z = 0 - 3. For disk-like n < 2
galaxies β is also -1.30, and for n > 2 galaxies β is -1.52.
For non-AGN host galaxies with SFR_IR detected above
the 5σ detection limit (see §4), β is -1.21, while for
non-AGN host galaxies not detected by Spitzer the slope is
substantially steeper (-1.67).
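
The power-law form above becomes linear in log space, log10(r_e/r_e,z~0) = log10(α) + β log10(1 + z), so α and β can be estimated with an ordinary least-squares line. The sketch below illustrates this under that assumption; it is not necessarily the fitting procedure used for Table 2. The demonstration values are synthetic and noiseless, generated from the model itself with an arbitrary α = 1, and are not survey measurements.

```python
import math


def size_ratio(z, alpha, beta):
    """Model size ratio r_e / r_e(z~0) = alpha * (1 + z)**beta."""
    return alpha * (1.0 + z) ** beta


def fit_power_law(redshifts, ratios):
    """Least-squares fit of log10(ratio) = log10(alpha) + beta * log10(1 + z)."""
    xs = [math.log10(1.0 + z) for z in redshifts]
    ys = [math.log10(r) for r in ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = 10.0 ** (mean_y - beta * mean_x)
    return alpha, beta


# Synthetic, noiseless ratios built from the model (illustration only).
demo_z = [0.0, 1.5, 2.5]
demo_ratios = [size_ratio(z, 1.0, -1.30) for z in demo_z]
alpha_fit, beta_fit = fit_power_law(demo_z, demo_ratios)
```

With real data the points would scatter about the line, and the fitted α need not equal unity.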

These results are comparable to the findings of
earlier studies. Buitrago et al. (2008) show for massive
(M_* > 10^11 M_sun) galaxies over z = 0 - 3 that β varies from
-0.8 for n < 2 disk-like galaxies to -1.5 for n > 2 spheroidal
galaxies. Williams et al. (2010) find β is -0.88 for all
massive (M_* > 6.3 x 10^10 M_sun) galaxies over z = 0.5 - 2. van
Dokkum et al. (2010) find a slope of -1.27 for massive
(M_* > 10^11 M_sun) galaxies over z = 0 - 2, which is a good
match to our slope (-1.30) for massive (M_* > 5 x 10^10
M_sun) galaxies of all n over z = 0 - 3. Compared to
massive z ~ 0 galaxies, the implied mean size evolution is a
factor of ~ 4 from z = 2 - 3 and a factor of ~ 3 from
z = 1 - 2. In order to determine whether this apparent
size evolution is real, one needs to address a number of
systematic effects, as outlined in the next section.

3.3. Impact of Systematic Effects on Structural 
Properties 

In the previous section we found that the massive
galaxies at z = 2 - 3 are strikingly offset toward lower (n, r_e)
compared to the massive z ~ 0 galaxies. It is relevant to
ask whether the large fraction of low (n, r_e) systems we
observe among massive galaxies at z = 2 - 3, compared
to massive galaxies at z ~ 0, is real or due to a number of
systematic effects. We address the most important effects
in the main text and include the others in Appendix B.
We consider the issues listed below:

1. Is it possible that the distribution of (n, r_e) for
massive galaxies at z ~ 0 and at z = 2 - 3 is
intrinsically similar, but that some selection effect
at z = 2.5 is making us preferentially detect the
compact low n systems, thereby causing an
artificial excess of the latter? We argue that this is
very unlikely because even if we take all the massive
compact low n systems at z ~ 0, and appropriately
scale them for the difference in number density
between z ~ 0 and z = 2.5, we still would fall far
short of reproducing the observed number densities
of compact low n systems. The number density of
massive (M_* > 1 x 10^11 M_sun) galaxies at z = 2.5
is approximately 30% that at z ~ 0. If we take
the most compact (r_e < 2 kpc) and low n < 2
systems at z ~ 0, and scale this number by 30%, we
find a much lower number density (2.8 x 10~^ gal
Mpc^-3) than the observed number density (5.0 x 10~^
gal Mpc^-3) at z = 2.5 for such compact systems.

2. Can redshift-dependent systematic effects cause
structural parameters, such as the high Sersic
index n of massive galaxies at z ~ 0, to 'degrade'
into the regime of low n < 2 values measured in
the z = 2 - 3 systems? We address this issue in
§3.3.1.

3. How robust are our fits to the NIC3/F160W images
of the z = 2 - 3 galaxies? Could some of the
galaxies with a best-fit Sersic index n < 2 have similarly
good fits with much higher n? We show in
Appendix B.1 and Appendix B.2 that this is unlikely.
We are confident that the fraction of n < 2 systems
is not being overestimated.

4. Can the offset in (n, r_e) between the z = 2 - 3
galaxies and the z ~ 0 galaxies be caused by systematic
differences between the fitting techniques applied by
us to the NIC3/F160W images of z = 2 - 3
galaxies and the fitting techniques used by Allen et al.
(2006) on the B-band images of the massive z ~ 0
galaxies in MGC? We conduct additional tests (see
Appendix B.3) and conclude that this is also not
the case.

3.3.1. Artificial Redshifting 

We next investigate whether redshift-dependent
systematic effects could potentially cause the offset in (n, r_e)
shown in Figure 5 between massive galaxies at z ~ 0 and
z = 2 - 3, by causing the (n, r_e) of massive z ~ 0
galaxies to 'degrade' into the regime of low n < 2 and low r_e
exhibited by the z = 2 - 3 systems.

Ideally one would investigate this question by
artificially redshifting the entire MGC subsample of 385
massive z ~ 0 galaxies shown in Figure 5 out to z ~ 2.5, and
re-decomposing the redshifted galaxies. However, this is
extremely time consuming, and, furthermore, many of the
galaxies do not have the high quality SDSS ugriz images that
the redshifting software (FERENGI; Barden et al. 2008)
needs in order to work. We therefore decide to artificially
redshift a smaller, but representative sample S1 of 255
galaxies. S1 consists of 42 massive (M_* > 5 x 10^10 M_sun)
MGC galaxies combined with 213 nearby (z < 0.05)
massive galaxies having high quality and well-resolved SDSS
imaging. We ensure the (n, r_e) of the 255 galaxies in
S1 match those of the entire subsample of MGC galaxies
shown in Table 1, Figure 8, and Figure 9. We also ensure
that the distribution of Hubble types of sample S1 matches
that of the MGC subsample; the MGC subsample
contains ~ 66% E/S0 galaxies versus ~ 34% Spirals, while
sample S1 is ~ 64% E/S0 galaxies and ~ 36% Spirals.
Many of the galaxies in S1 are well studied and include
E, S0, and Sabc galaxies from Barden et al. (2008), E
galaxies in Kormendy et al. (2009), as well as S0s and
bulge-dominated spirals from Eskridge et al. (2002).

We used FERENGI (Barden et al. 2008) to artificially 
redshift the SDSS ugriz images (tracing rest-frame UV-to-optical
light) of z ~ 0 galaxies, out to z = 2.5, and
re-observe them with the NIC3 F160W filter to the same 
depth as the GNS survey. During this process, FERENGI 
mimics the effects of surface brightness dimming, instru- 
mental resolution, transmission efficiency, and PSF effects. 
It also corrects for other geometrical effects of cosmolog- 
ical redshift by appropriately re-binning input images for 
the desired redshift and platescale. 

Specifically, during artificial redshifting, as is standard
convention, FERENGI assumes surface brightness
dimming at the rate of (1 + z)^-4 for the bolometric luminosity
of the full redshifted rest-frame optical SED. For
galaxies where only part of this redshifted rest-frame optical
SED falls within the NIC3/F160W filter bandwidth, the
observed flux per unit wavelength f_λ relates to the
rest-frame luminosity per unit wavelength at redshift z via a
redshift-dependent factor (e.g., Weedman 1986). The exact
surface brightness dimming in such a case will be set by
the integral of f_λ over the filter-detector response
function and depends on the detailed shape of the SED (e.g.,
Hogg 1999; Hogg et al. 2002). In practice, when using the
FERENGI software, the relevant degree of surface
brightness dimming is automatically applied when FERENGI
convolves the redshifted images with the NIC3 F160W
PSF and then re-observes the redshifted SED with the
NIC3 F160W (H) filter-detector, while taking into account
the filter-detector characteristics, such as bandwidth and
transmission efficiency. An exposure time of three orbits
(8063 seconds) and a resolution of 0.2"/pixel is assumed
to mimic the GNS survey. A sky background equal to
the mean sky background of the GNS NIC3 images (0.1
counts/second) was added to the redshifted images.
Poisson noise, sky noise, and read noise (29 e^- for NIC3) were
then added to the redshifted images.

During artificial redshifting of local galaxies, it is
standard procedure to incorporate surface brightness evolution
(Barden et al. 2008) because galaxies at higher redshifts
have been observed to have higher mean surface brightness
after applying the standard correction for the
geometrical effect of cosmological surface brightness dimming. For
instance, Lilly et al. (1998) find that surface brightness
for disk-dominated galaxies of similar properties increases
on average by 0.8 magnitudes by z = 0.7. Barden et al.
(2005) find from the GEMS ACS survey that galaxies with
M_V < -20 show a brightening of ~ 1 magnitude in
rest-frame V-band by z ~ 1. Labbe et al. (2003) find a
disk-like galaxy with spectroscopic redshift z = 2.03 to have
a rest-frame B-band surface brightness ~ 2 magnitudes
brighter than nearby galaxies. Finally in our own study,
the mean surface brightness within r_e of massive
galaxies at z = 2 - 3 is 2 to 6 magnitudes brighter than that
of massive galaxies at z ~ 0, with a mean offset of ~ 4.5
magnitudes (Figure 7).

In our experiment of artificially redshifting massive
galaxies from z ~ 0 to z = 2.5, we applied a
conservative value of 2.5 magnitudes of surface brightness
evolution. This value is motivated by several considerations:
a) 2.5 magnitudes of surface brightness evolution is on
the conservative side as many of the massive galaxies at
z = 2.5 show even more evolution (Figure 7). Thus,
using this value will not lead to overoptimistic recovery of
faint features during the experiment; b) The adopted 2.5
magnitudes of evolution out to z = 2.5 corresponds to one
magnitude of brightening per unit redshift. This rate of
brightening is comparable to those seen in studies out to
z ~ 2 (Lilly et al. 1998; Barden et al. 2005; Labbe et
al. 2003); c) Using the Bruzual & Charlot (2003) models,
one can show that the passive evolution of a single stellar
population from z = 2.5 to z = 0, assuming an
exponentially declining star formation history associated with an
e-folding time of 100 Myr, will lead the rest-frame B
luminosity to decline by 2.5 to 3 magnitudes, depending on
the chosen metallicity.
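
To put the adopted brightening in context, the sketch below compares it with bolometric (1 + z)^-4 surface brightness dimming expressed in magnitudes, 10 log10(1 + z). This is an illustration only: the band-dependent dimming applied in the actual experiment is handled inside FERENGI and depends on the SED, and the function names here are hypothetical.

```python
import math


def cosmological_dimming_mag(z):
    """Bolometric (1 + z)^-4 surface brightness dimming, in magnitudes."""
    return 10.0 * math.log10(1.0 + z)


def net_dimming_mag(z, brightening_mag):
    """Dimming left over after subtracting an assumed intrinsic brightening."""
    return cosmological_dimming_mag(z) - brightening_mag


dimming = cosmological_dimming_mag(2.5)  # about 5.44 mag
net = net_dimming_mag(2.5, 2.5)          # about 2.94 mag
```

Under these simplifying assumptions, 2.5 magnitudes of brightening offsets a little less than half of the bolometric dimming at z = 2.5.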

While we believe that 2.5 magnitudes of surface
brightness evolution is a conservative and reasonable value to use
during the artificial redshifting experiment, for the sake of
completeness, we have also tested the effect of applying a
surface brightness evolution (brightening) of 0, 1.25, 2.5,
and 3.75 magnitudes between z ~ 0 and z = 2.5. There is
a discernible difference in the recovered morphology and
structural parameters between 0 and 1.25 magnitudes of
brightening, but less difference between 1.25, 2.5, or 3.75
magnitudes of brightening. More details on the use of zero
surface brightness evolution are given in point 4 at the end
of this section.

After artificially redshifting S1 from z ~ 0 to z = 2.5,
we fit both the original galaxy images and their redshifted
counterparts with single Sersic profiles. We compare the
rest-frame optical structural parameters in the original
and redshifted images in order to assess the influence of
redshift-dependent systematic effects (e.g., surface
brightness dimming, loss of spatial resolution) and see how well
the structural parameters are recovered. We also compare
the redshifted distribution of (n, r_e) to the one actually
observed in the GNS massive galaxies to assess whether
they are similar. Note that the structural parameters are
measured at z ~ 0 from g-band images, while at z = 2.5
they are measured from the artificially redshifted images
in the NIC3/F160W band so that all parameters are
measured in the rest-frame blue optical light, thereby avoiding
bandpass shifting problems. Our main results are outlined
below.

1. Figure 8 shows the (n, r_e) distribution obtained by
redshifting the sample S1 (magenta points in row
1) of 255 z ~ 0 massive galaxies to z ~ 2.5 (blue
points in row 2). This redshifted distribution of (n,

13 The Hubble types are based on the bulge-to-total light ratio (B/T),



r_e) is still significantly offset from that observed in
the massive GNS galaxies at z = 2 - 3 (red points
in row 2).

This difference is shown more quantitatively in
Figure 9, where results in discrete bins of n and r_e are
compared. The massive galaxies at z = 2 - 3 (red
line) include 64.9 ± 5.4% of systems with low n < 2,
while the corresponding fraction for the redshifted
sample (blue line) is 10.6 ± 1.9%. Similarly, for the
r_e distribution of the massive galaxies at z = 2 - 3,
39.0 ± 5.6% have r_e < 2 kpc, while the redshifted
sample has 1.2 ± 0.7%. We therefore conclude that
cosmological and instrumental effects are not able
to account for the large offset shown in Figure 8
and Figure 9 between the (n, r_e) distributions of
the massive galaxies at z = 2 - 3 and those at z ~ 0.

2. It is very interesting to look at how the structural
parameters of galaxies of different morphological
types change during the redshifting. Figure 10
compares the rest-frame optical structural parameters
in massive E, S0, and spirals at z ~ 0 to the
structural parameters recovered after these galaxies were
artificially redshifted.

From Figure 10, one can see that r_e is recovered
to better than a factor of 1.5 for the vast majority
of redshifted E/S0 and spirals of early-to-late
Hubble types. In the case of a small fraction of z ~ 0
galaxies with highly extended halos or disks and
associated large r_e, the recovered r_e at z = 2.5 can be
nearly a factor of two lower than the original r_e at
z ~ 0. Inspection of the surface brightness profiles
shows that this effect primarily happens because
surface brightness dimming prevents the outer lower
surface brightness components of the galaxies from
being adequately recovered after redshifting.

It is striking that even after redshifting out to
z = 2.5, practically none of the massive z ~ 0
galaxies fall into the regime of r_e < 2 kpc (shown
as shaded areas) inhabited by the ultra-compact
systems, which make up ~ 40% (39.0 ± 5.6% for
M_* > 5 x 10^10 M_sun and 39.0 ± 7.6% for M_* > 1 x 10^11
M_sun) of the massive galaxies at z = 2 - 3 (see §3.2).
Thus, these massive ultra-compact (r_e < 2 kpc)
systems at z = 2 - 3 appear to truly have no analogs
among z ~ 0 massive galaxies, in terms of their
size, structure, and optical surface brightness.

The top row of Figure 10 shows the distribution
of Sersic index n before and after redshifting out
to z = 2.5. The recovered Sersic index n can be
lower or higher than the original n at z ~ 0, but
is recovered to better than a factor of two in all
cases. The shaded area in the plots represents the
regime of n < 2 where the majority (64.9 ± 5.4% for
M_* > 5 x 10^10 M_sun and 58.5 ± 7.7% for M_* > 1 x 10^11
M_sun) of massive GNS galaxies at z = 2 - 3 lie
(Table 1). It is interesting to note that massive E
and S0s, which are spheroid-dominated and
bulge-dominated systems, do not typically lie in the n < 2
regime, before or after redshifting. In contrast, a

which we measured with bulge-disk and bulge-disk-bar decomposition 



large fraction of z ~ 0 spirals with
intermediate-to-late Hubble types^13 populate the n < 2 regime,
both before and after redshifting. Disk features on
large and small scales (e.g., outer disk or disky
pseudobulge) lead to an overall single Sersic index n < 2
for the entire galaxy. It is possible that similar disk
features are responsible, at least in part, for the low
n < 2 values shown by the majority (~ 65%) of the
massive GNS galaxies at z = 2 - 3. We discuss this
point further in §7.1.

3. One important question is whether the use of deeper
images of the z ~ 0 galaxies would change the
conclusion of the redshifting experiment. In the present
experiment, we used SDSS g-band images, which
have an exposure time of 54 seconds and a typical
sky background of 22 mag/arcsec^2. Deeper
exposures of nearby galaxies may potentially detect an
outer low surface brightness halo (if such a halo
exists), which is missed in the SDSS images, and in
that case lead us to measure larger (n, r_e) at z ~ 0
with a Sersic fit. Such halos can be found in very
local massive elliptical and cD galaxies, where the
measured (n, r_e) can increase significantly if the
halo is included in the fit (Kormendy et al. 2009).
However, such low surface brightness halos will be
dimmed out and not recovered during the artificial
redshifting of these deep images, so that the (n, r_e)
parameters recovered at z = 2.5 will be similar to
those we presently obtain from the SDSS images.
The net effect will be that using deeper images of
local massive galaxies during the artificial
redshifting will at most raise the (n, r_e) at z ~ 0, but not at
z = 2.5. Thus the difference in the (n, r_e) at z ~ 0
compared to z = 2.5 will be unchanged (for systems
without halos) or amplified (for systems with such
halos). Our overall conclusion from the redshifting
experiment regarding degradation of the profiles to
n < 2 and r_e < 2 kpc would remain unchanged or
be even stronger.

4. Finally, as one additional test, we repeated the
redshifting experiment assuming zero surface
brightness evolution, rather than 2.5 magnitudes of
brightening, out to z = 2.5. Even in this case there
is still a large offset in the (n, r_e) distributions of
the redshifted sample S1 compared to the massive
GNS galaxies. Specifically, the fraction of systems
with low n < 2 (22.0 ± 2.6%) is still significantly
less than that for massive GNS galaxies at z = 2 - 3
(64.9 ± 5.4%). Likewise, there are still few systems
with r_e < 2 kpc (1.6 ± 0.8%) compared to the high
fraction (39.0 ± 5.6%) found at z = 2 - 3. Thus, even
without surface brightness evolution it is still true
that cosmological and instrumental effects are not
able to account for the large offset between massive
galaxies at z = 2 - 3 versus z ~ 0.

4. STAR FORMATION ACTIVITY 

4.1. Matching GNS Galaxies to MIPS 24 μm
Counterparts



The Spitzer GOODS Legacy Program (Dickinson et
al. 2003a; Dickinson et al. in preparation) provides deep
Spitzer MIPS 24 μm observations of the GOODS fields.
In the discussion below, we only consider MIPS 24 μm
counterparts with f_24μm >= 30 μJy, the 5σ flux limit. The
MIPS images have a PSF diameter of 6" (~ 42 kpc at
z = 1 - 3), versus the NIC3/F160W PSF of 0.3". MIPS
24 μm counterparts of the massive GNS galaxies were
identified by selecting the closest MIPS 24 μm source within a
maximum matching radius of 1.5". We initially find 84/166
massive GNS galaxies with MIPS 24 μm counterparts with
f_24μm >= 30 μJy and further refine these matches below.

There are several potential problems with the above
procedure. Firstly, it allows for the situation where a given
MIPS 24 μm source could be matched to several massive
GNS galaxies. This would happen if some massive GNS
galaxies were crowded within a radius of a few arcseconds
so that the MIPS source would be within 1.5" of all of
them. This situation occurs for 2/84 (~ 2%) of massive
galaxies with a MIPS counterpart. We reject these two
cases, reducing the number of unique and secure matches
from 84 to 82.
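
The matching and rejection steps described above can be sketched as follows: each galaxy is paired with its nearest 24 μm source within a 1.5" radius, and any source claimed by more than one galaxy is discarded. This is a minimal illustration with hypothetical function names and a small-angle separation formula; it is not the code used for the survey catalogs.

```python
import math
from collections import Counter


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Small-angle separation in arcsec; all inputs in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    dra = (ra1 - ra2) * math.cos(mean_dec)
    ddec = dec1 - dec2
    return 3600.0 * math.hypot(dra, ddec)


def match_counterparts(galaxies, sources, max_radius=1.5):
    """Return {galaxy index: source index} for unique matches only.

    galaxies, sources : lists of (ra, dec) tuples in degrees
    max_radius        : matching radius in arcsec
    """
    nearest = {}
    for i, (gra, gdec) in enumerate(galaxies):
        best = min(
            range(len(sources)),
            key=lambda j: angular_sep_arcsec(gra, gdec, *sources[j]),
            default=None,
        )
        if best is None:
            continue
        if angular_sep_arcsec(gra, gdec, *sources[best]) <= max_radius:
            nearest[i] = best
    # Reject sources that were assigned to more than one galaxy.
    counts = Counter(nearest.values())
    return {i: j for i, j in nearest.items() if counts[j] == 1}
```

The brute-force search is quadratic in catalog size, which is adequate for a sample of a few hundred objects; a larger catalog would call for a spatial index.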

A second possible caveat is that within the large MIPS
24 μm PSF of 6" diameter, there may be several other
NIC3/F160W sources, in addition to the main massive
GNS galaxy to which the MIPS source is matched. These
extra NIC3/F160W sources may even be lower mass
galaxies not in our sample of massive (M_* > 5 x 10^10 M_sun)
galaxies. In such a scenario, all the extra NIC3 sources
could potentially contribute to the MIPS 24 μm flux, and
assigning all the 24 μm flux of the MIPS counterpart to the
nearest massive GNS galaxy would overestimate the 24 μm
flux of this galaxy. In order to assess the extent of this
potential problem, we proceed as follows. For the MIPS
24 μm counterpart assigned previously to each massive
GNS galaxy, we determine how many extra NIC3/F160W
sources with M^ > 10^ M©, in addition to the massive 
GNS galaxy, lie within a circle of diameter 6" (i.e., the
PSF diameter) centered on the MIPS source. Of the 82
massive GNS galaxies with a secure MIPS 24 μm
counterpart, 30 involve cases where there are extra NIC3 sources,
along with the massive GNS galaxy, inside the MIPS PSF
diameter.

Next, we estimate the relative expected contributions
of the massive GNS galaxy and the extra NIC3/F160W
sources to the overall 24 μm flux by using the stellar mass
ratio of the main massive GNS galaxy (e.g., M_*1) and of
the contaminating source (e.g., M_*2), scaled by a function
that takes into account the different redshifts of the two
sources. Specifically, for the two sources with stellar mass
M_*1 and M_*2, having redshifts z_1 and z_2 and luminosity
distances D_L1 and D_L2, the stellar mass ratio M_*1/M_*2
is scaled by ((1 + Z2) x Dl^)/{(1 + zi) x Dl-^). In 8 of 
30 cases, the contribution of the extra NIC3
contaminating sources to the overall 24 μm flux is > 20% that of the
main GNS galaxy, and spans ~ 40% to ~ 126%. We reject
these latter 8 cases rather than try to correct for the
contamination, which in all cases is distributed across two or
more nearby galaxies. For the remaining 22 cases, the
contamination by extra NIC3/F160W sources is < 20% and
we deem that our afore-described procedure of assigning
all the 24 μm flux of the MIPS counterpart to the massive
GNS galaxy is reasonable.

Therefore, in summary, 74/166 (44.6 ± 3.9%) massive
(M_* > 5 x 10^10 M_sun) GNS galaxies have a reliable MIPS
24 μm counterpart (with f_24μm >= 30 μJy) whose entire
flux is assigned to the massive GNS galaxy. In contrast,
82/166 (49.4 ± 3.9%) massive GNS galaxies do not have
a reliable MIPS counterpart with f_24μm >= 30 μJy and in
these cases we can only measure upper limits on their SFR.
Table 3 lists the fraction of massive GNS galaxies with a
MIPS 24 μm counterpart as a function of redshift.
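
The uncertainties quoted on these fractions are consistent with a simple binomial standard error, sqrt(p(1 - p)/N). The text does not name the estimator, so treating it as binomial is an assumption; the short sketch below, with a hypothetical helper name, reproduces the quoted values from the counts given above.

```python
import math


def fraction_with_error(k, n):
    """Return (percentage, binomial standard error in percent) for k of n."""
    p = k / n
    err = math.sqrt(p * (1.0 - p) / n)
    return 100.0 * p, 100.0 * err


frac_detected, err_detected = fraction_with_error(74, 166)
frac_undetected, err_undetected = fraction_with_error(82, 166)
```

Rounded to one decimal place, these give 44.6 ± 3.9% and 49.4 ± 3.9%, matching the values in the text. The binomial approximation becomes unreliable for fractions very close to 0 or 1 or for very small N.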

4.2. Star Formation Rates 

In order to estimate the SFR, the total IR luminosity
(L_IR) over 8-1000 μm is first estimated from the observed
24 μm flux (corresponding to rest-frame wavelengths of
6-12 μm over z = 1 - 3) by using SED templates from
Chary & Elbaz (2001). Using solely 24 μm flux density to
measure L_IR works well for inferred L_IR < 10^12 L_sun
galaxies at z ~ 2, but L_IR is overestimated by a factor of ~ 3
in more luminous galaxies (e.g., Papovich et al. 2007).
Early results from Herschel (e.g., Elbaz et al. 2010;
Nordon et al. 2010; D. Lutz, private communication) suggest
that at z > 1.5, the SFRs extrapolated from 24 μm fluxes
may overestimate the true SFR, typically by a factor of 2
to 4 and possibly as much as a factor of 10. This
overestimate could be due to a rise in the strength of PAH
features, changes in the SEDs, or AGN contamination at
z > 1.5. Murphy et al. (2009) find that estimates of L_IR
from 24 μm flux density alone are incorrect because the
templates used are based on local galaxies with smaller
PAH equivalent widths than galaxies of similar luminosity
at high redshift. We account for this discrepancy by
making a correction for galaxies with inferred L_IR > 6 x 10^11
L_sun using

\log_{10}(L_{\rm IR}) = 0.59 \times \log_{10}(L_{\rm IR}^{24}) + 4.8,    (3)

where L_IR^24 is the infrared luminosity inferred solely from
24 μm flux density (R. Chary, private communication).
The upper-left and upper-right panels of Figure 11 show
the distribution of 24 μm flux and the inferred L_IR.
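
A minimal sketch of the correction in Equation (3) is given below. It assumes luminosities in solar units and reads the threshold as 6 x 10^11 L_sun, at which the corrected and uncorrected values nearly coincide; the function and constant names are illustrative.

```python
import math

# Threshold above which the correction is applied, in L_sun (assumed reading).
CORRECTION_THRESHOLD_LSUN = 6.0e11


def corrected_l_ir(l_ir_24):
    """Apply Eq. (3) to a 24 um-inferred L_IR (in L_sun) above the threshold."""
    if l_ir_24 <= CORRECTION_THRESHOLD_LSUN:
        return l_ir_24
    return 10.0 ** (0.59 * math.log10(l_ir_24) + 4.8)
```

For example, an inferred luminosity of 10^13 L_sun is reduced to 10^12.47 L_sun, roughly a factor of 3.4 lower, while values below the threshold are returned unchanged.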

The obscured star formation rate can be calculated
using the expression

{\rm SFR}_{\rm IR} = 9.8 \times 10^{-11} L_{\rm IR}    (4)

from Bell et al. (2007). This calculation is based on a
Chabrier IMF (Chabrier 2003) and assumes that the
infrared emission is radiated by dust that is heated primarily
by massive young stars. Uncertainties in the SFR
estimates are a factor of ~ 2 or higher for individual galaxies.
If an AGN is present, then SFR_IR only gives an upper
limit to the true SFR. In §6 we adopt several techniques
to identify AGN candidates in the sample and estimate
the mean SFR for galaxies with and without a candidate
AGN (see Table 3). The upper-right panel of Figure 11
shows L_IR for AGN and non-AGN, and the bottom
panels show SFR_IR. The AGN candidates dominate the tail
of highest L_IR and SFR_IR. Among the HyLIRGs^14, 9/11
(~ 82%) turn out to be AGN. After excluding the AGN
candidates, the mean L_IR is a factor of ~ 8 lower,
while the mean SFR_IR is reduced by a factor of ~ 1.5 to ~ 2.7,
and the difference rises with redshift (Table 3).
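
Equation (4) is a direct proportionality, sketched below with L_IR in solar luminosities and the SFR in solar masses per year. The names are illustrative, and, as noted above, the result is only an upper limit when an AGN contributes to the infrared emission.

```python
# Conversion factor from Eq. (4): M_sun/yr per L_sun (Chabrier IMF).
SFR_PER_LSUN = 9.8e-11


def sfr_ir(l_ir_lsun):
    """Obscured star formation rate in M_sun/yr from L_IR in L_sun."""
    return SFR_PER_LSUN * l_ir_lsun
```

For instance, L_IR = 10^12 L_sun corresponds to about 98 M_sun/yr.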

14 HyLIRGs are defined to have L_IR > 10^13 L_sun



How do our measurements of SFR_IR compare with
UV-based SFR derived in other studies of high-redshift
galaxies? The left panel of Figure 12 plots SFR_IR versus M_* for
the massive (M_* > 5 x 10^10 M_sun) GNS galaxies at z = 2 - 3
with 24 μm flux above the 5σ limit (30 μJy). We
demonstrate that the SFR derived at z = 2 - 3 for non-AGN are
in approximate agreement with the UV-based SFR from
Daddi et al. (2007). Drory & Alvarez (2008) parameterize
SFR as a function of mass and redshift for a wide range
in stellar mass (M^ ~ 10^ — 10^^ ^o)- In the right panel 
of Figure 12, the black line shows average SFR versus
redshift for a 5 x 10^10 M_sun galaxy as calculated by Drory &
Alvarez (2008). The mean SFR_IR for massive non-AGN
GNS galaxies, with SFR_IR above the 5σ limit, are higher
by a factor of ~ 1.5 - 4 over z = 1 - 3, with the
offset worsening with redshift. This disagreement with mean
SFR_IR is not just a bias caused by the requirement that
SFR_IR exceed the 5σ limit, which selects the most intense
star-forming systems at each redshift. Even if the upper
limits on SFR_IR are included, our SFR_IR do not show the
same break and flattening seen at z ~ 2 by Drory &
Alvarez (2008). Finally, Bauer et al. (2011) measure
dust-corrected UV-based SFR (SFR_UV,corr) for galaxies in GNS
over 1.5 < z < 3. Among massive (M_* > 5 x 10^10 M_sun)
galaxies, SFR_UV,corr can differ from SFR_IR by as much as a
factor of 10, but for higher SFR_IR the difference is typically
a factor of ~ 2 - 3.

4.3. Relation Between Star Formation and Structure 

Figure 13 shows the distribution of SFR_IR among
systems of different n. On the LHS panel, galaxies with
SFR_IR below the 5σ detection limit are shown as
downward pointing arrows. The potential AGN candidates
identified in §6 are coded separately as SFR_IR is likely
overestimating the true SFR in the galaxy. For the
histograms on the RHS panel, the y-axis shows the fraction
of massive GNS galaxies in each redshift bin, while on the
x-axis, we plot the actual value of SFR_IR for systems with
SFR_IR above the 5σ detection limit (indicated by the
vertical line), and the upper limit for the other systems.

The massive galaxies at z = 1 - 3 display several
interesting relations between their star formation activity and
structure, as characterized by the Sersic index n. Firstly,
among the non-AGN massive (M_* > 5 x 10^10 M_sun) galaxies
at z = 2 - 3, the fraction of galaxies with low n < 2
having SFR_IR high enough to produce a 24 μm flux above the
5σ detection limit is 53.4 ± 10.9%, which is significantly
higher than the corresponding fraction (15.4 ± 10.0%) for
systems with n > 2. Secondly, among the non-AGN
massive (M_* > 5 x 10^10 M_sun) galaxies at z = 2 - 3 with SFR_IR
above the 5σ detection limit, the majority (84.6 ± 10.0%)
have low n < 2, while none have n > 4. The
corresponding numbers for the redshift bin z = 1 - 2 are 67.7 ± 8.0%
and 11.8 ± 5.5%, respectively. Thirdly, the RHS panel of
Figure 13 shows that the high SFR tail in each redshift bin
is populated primarily by n < 2 systems. While the n < 2
disky systems have a wide range of SFR_IR (21 to 626 M_sun
yr^-1 at z = 1 - 2, 53 to 1466 M_sun yr^-1 at z = 2 - 3), they
include the systems of the highest SFR at both z = 1 - 2
and z = 2 - 3. Thus, the systems with low n < 2 seem
to be more actively star-forming than the systems of high
n > 3.

Most (72.0 ± 6.3%) of systems with low n < 2 are extended (r_e > 2 kpc), so a relation is also expected between SF activity and size. We thus investigate next the relationship between SFR and effective radius r_e. The distribution of SFR_IR for different r_e ranges is shown in Figure 14. The same convention as for Figure 13 is adopted, with upper limits being plotted for galaxies with SFR_IR below the 5σ detection limit, and only non-AGN systems being plotted on the RHS panel. We find that among the non-AGN massive (M⋆ > 5 × 10^10 M⊙) galaxies at z = 2-3, the fraction of ultra-compact (r_e < 2 kpc) objects with SFR_IR above the 5σ detection limit is only 15.0 ± 8.0%, compared to 32.4 ± 8.0% for the whole sample. Thus, among non-AGN massive galaxies over z = 2-3, the ultra-compact (r_e < 2 kpc) galaxies show a deficiency by a factor of ~ 2.2 of systems with SFR_IR above the detection limit, compared to the whole sample. At z = 1-2, the deficiency is a factor of ~ 3.5. Furthermore, as illustrated by the RHS panel of Figure 14, although there are some ultra-compact (r_e < 2 kpc) galaxies with high SFR_IR, the mean SFR_IR of the ultra-compact galaxies at both z = 2-3 and z = 1-2 is significantly lower than that of more extended galaxies.

5. CONSTRAINTS ON COLD GAS CONTENT 

The high estimated SFR_IR found in §4 suggest that copious cold gas reservoirs are present to fuel the star formation. For the massive GNS galaxies with SFR_IR measurements above the 5σ detection limit, we assume that half of SFR_IR lies within the circularized rest-frame optical half-light radius (r_c = r_e × √(b/a)) from single component Sersic fits, and thereby estimate the deprojected SFR per unit area as

    \Sigma_{\rm SFR,IR} = \frac{0.5 \times {\rm SFR_{IR}}}{\pi \times r_c^{2}}.    (6)

In galaxies that are AGN host candidates, SFR_IR, and hence Σ_SFR,IR, likely overestimates the true star formation in the galaxy (see §6). If potential AGN candidates are included, Σ_SFR,IR ranges from ~ 0.10 to 360.8 M⊙ yr^-1 kpc^-2, with a mean value of ~ 19.4 M⊙ yr^-1 kpc^-2 over z = 1-3. After excluding the potential AGN candidates, Σ_SFR,IR ranges from ~ 0.24 to 360.8 M⊙ yr^-1 kpc^-2, with a mean value of ~ 14.8 M⊙ yr^-1 kpc^-2. This range is comparable to that seen in BzK/normal galaxies, ULIRGs, and submillimeter galaxies (e.g., see Daddi et al. 2010b).
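The SFR surface density estimate described above can be sketched in a few lines of Python. This is only an illustration of the stated recipe (half of SFR_IR inside the circularized half-light radius); the function names and the example galaxy are hypothetical and are not taken from the GNS catalog.

```python
import math


def circularized_radius_kpc(r_e_kpc, axis_ratio):
    """Circularized half-light radius r_c = r_e * sqrt(b/a), in kpc."""
    return r_e_kpc * math.sqrt(axis_ratio)


def sfr_surface_density(sfr_ir, r_c_kpc):
    """SFR surface density in Msun/yr/kpc^2, assuming half of SFR_IR
    (in Msun/yr) lies within the circularized half-light radius r_c."""
    return 0.5 * sfr_ir / (math.pi * r_c_kpc ** 2)


# Hypothetical galaxy: r_e = 3 kpc, b/a = 0.64, SFR_IR = 200 Msun/yr.
r_c = circularized_radius_kpc(3.0, 0.64)
sigma_sfr = sfr_surface_density(200.0, r_c)
print(f"r_c = {r_c:.2f} kpc, Sigma_SFR = {sigma_sfr:.2f} Msun/yr/kpc^2")
```

For these illustrative inputs the result falls inside the range of Σ_SFR,IR quoted in the text.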

We use a standard Schmidt-Kennicutt law (Kennicutt 1998), with a power-law index of 1.4 and a normalization factor of 2.5 × 10^-4 (for Σ_SFR,IR in M⊙ yr^-1 kpc^-2 and gas surface density in M⊙ pc^-2), to estimate the cold gas surface density from Σ_SFR,IR. The results are uncertain by at least a factor of ~ 2.5 because different relations between molecular gas surface density and SFR surface density have been suggested for various types of star-forming systems over a broad range of redshifts (Kennicutt 2008; Gnedin & Kravtsov 2010; Daddi et al. 2010b; Genzel et al. 2010; Tacconi et al. 2010). If potential AGN candidates are included, the resulting implied cold gas surface density

    \Sigma_{\rm gas} = \left( \frac{10^{4} \times \Sigma_{\rm SFR,IR}}{2.5} \right)^{1/1.4}

ranges from ~ [...] to 25,091 M⊙ pc^-2, with a median value of ~ [...] M⊙ pc^-2 over z = 1-3 (Figure 15). The corresponding values after excluding AGN candidates are ~ 136 to 25,091 M⊙ pc^-2, with a median value of ~ 607 M⊙ pc^-2 (Figure 15). These values are again comparable to those observed in BzK/normal galaxies, ULIRGs, and submillimeter galaxies (e.g., see Daddi et al. 2010b).
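As a rough check on the numbers above, the Schmidt-Kennicutt relation can be inverted numerically. The sketch below assumes the standard Kennicutt (1998) form, Σ_SFR = 2.5 × 10^-4 (Σ_gas / M⊙ pc^-2)^1.4 M⊙ yr^-1 kpc^-2; the function name is hypothetical. With this normalization the quoted maximum Σ_SFR,IR of 360.8 M⊙ yr^-1 kpc^-2 maps onto the quoted maximum gas surface density of about 25,091 M⊙ pc^-2.

```python
def gas_surface_density(sigma_sfr, norm=2.5e-4, index=1.4):
    """Invert Sigma_SFR = norm * Sigma_gas**index.

    sigma_sfr is in Msun/yr/kpc^2; the result is in Msun/pc^2.
    The default norm and index are the Kennicutt (1998) values
    assumed in this sketch.
    """
    return (sigma_sfr / norm) ** (1.0 / index)


# Upper end of the Sigma_SFR,IR range quoted in the text.
sigma_gas_max = gas_surface_density(360.8)
print(f"Sigma_gas = {sigma_gas_max:.0f} Msun/pc^2")
```

Because the exponent 1/1.4 is below one, a factor of ten uncertainty in Σ_SFR,IR translates into roughly a factor of five in Σ_gas.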

In the subsequent discussion, we only cite values obtained after excluding AGN candidates, but Figure 15 also shows the values for the full sample of galaxies. Next we estimate the cold gas fraction relative to the baryonic mass within r_c. For each galaxy, we use the above cold gas surface density to estimate the total cold gas mass within the circularized rest-frame optical half-light radius,

    M_{\rm gas}(r_c) = \Sigma_{\rm gas} \times \pi \times r_c^{2}.    (7)

M_gas ranges from 3.4 × 10^9 to 1.0 × 10^11 M⊙, with a mean value of 1.9 × 10^10 M⊙ (Figure 15). The baryonic mass (M_Baryon) within r_c is taken to be the sum of the cold gas mass and the stellar mass within r_c, and we assume that the latter term is half of the total stellar mass of the galaxy.

The cold gas fraction (f_gas(r_c)) within the circularized rest-frame optical half-light radius r_c is defined as

    f_{\rm gas}(r_c) = M_{\rm gas} / [M_{\rm gas} + M_{\star}].    (8)

Here M⋆ denotes the stellar mass within r_c, taken to be half of the total stellar mass of the galaxy. The cold gas fraction f_gas(r_c) ranges from 6.5 to 65.4%, with a mean of ~ 23% over z = 1-3 (Figure 15). Figure 16 shows how f_gas(r_c) varies as a function of stellar mass and redshift, both with and without the AGN candidates. For galaxies with 5 × 10^10 M⊙ < M⋆ < 10^11 M⊙ above the 5σ detection limit, the mean f_gas(r_c) (without AGN candidates) rises from ~ 19% to ~ 25% to ~ 41% across the three redshift bins. In comparison, for M⋆ > 10^11 M⊙ galaxies, the mean cold gas fraction is ~ 14% to ~ 23%. The 1σ error bars are large and there is considerable overlap between the two mass ranges. Still, the highest cold gas fractions within the circularized rest-frame optical half-light radius at a given redshift are found among the less massive galaxies, consistent with downsizing.
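The gas mass and gas fraction within r_c follow directly from the gas surface density. The sketch below is a minimal illustration under the assumptions stated in the text (gas mass inside r_c, and half of the total stellar mass inside r_c). The function names, the choice of r_c = 2 kpc, and the total stellar mass of 10^11 M⊙ are hypothetical; only the value of 607 M⊙ pc^-2 is taken from the text (the quoted median).

```python
import math

PC2_PER_KPC2 = 1.0e6  # (1000 pc)^2 in one kpc^2


def gas_mass_within_rc(sigma_gas_msun_pc2, r_c_kpc):
    """Cold gas mass (Msun) inside r_c: Sigma_gas * pi * r_c^2."""
    return sigma_gas_msun_pc2 * math.pi * r_c_kpc ** 2 * PC2_PER_KPC2


def gas_fraction_within_rc(m_gas, m_star_total):
    """Gas fraction relative to the baryonic mass inside r_c, assuming
    half of the total stellar mass lies within r_c."""
    m_star_rc = 0.5 * m_star_total
    return m_gas / (m_gas + m_star_rc)


# Median Sigma_gas from the text; r_c and the stellar mass are made up.
m_gas = gas_mass_within_rc(607.0, 2.0)
f_gas = gas_fraction_within_rc(m_gas, 1.0e11)
print(f"M_gas = {m_gas:.2e} Msun, f_gas(r_c) = {100 * f_gas:.1f}%")
```

Note the unit conversion: Σ_gas is quoted per pc^2 while r_c is in kpc, so the area must be multiplied by 10^6.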

Our inferred cold gas fractions f_gas(r_c) within the circularized rest-frame optical half-light radius r_c can be higher or lower than the total cold gas fraction of the galaxy, depending on whether the molecular gas is centrally concentrated or extended, respectively. While bearing this caveat in mind, we note that our inferred values for f_gas(r_c) are consistent with previous direct measurements of the total cold gas fraction at high redshift. Daddi et al. (2008, 2010a) report gas fractions of 50-65% in massive (M⋆ ~ 4 × 10^10 - 1 × 10^11 M⊙) IR-selected BzK galaxies at z ~ 1.5. Tacconi et al. (2010) also measure cold gas fractions from CO observations of high-redshift galaxies at z = 1.1-2.4. For stellar masses spanning M⋆ ~ 3 × 10^10 - 3.4 × 10^11 M⊙, they find cold gas fractions in the range of ~ 14-78%.

6. AGN IN MASSIVE GALAXIES AT Z = 1 - 3 

6.1. Frequency of AGN 

We use a variety of techniques (X-ray properties, IR power-law, IR-to-optical excess, and mid-IR colors) to identify Active Galactic Nuclei (AGN) among the massive GNS galaxies, because selection based on X-ray emission alone may fail at high redshift in the case of Compton-thick AGN, where much of the soft X-ray emission is Compton scattered or absorbed by thick columns of gas (N_H > 10^24 cm^-2; Brandt et al. 2006). We briefly summarize here and in Table 4 the number of AGN identified by each of the selection methods; the mid-IR color criteria are addressed in a footnote.

1. X-ray counterparts to the massive GNS sources were searched for in the CDF-N and CDF-S catalogs of Alexander et al. (2003) and Luo et al. (2008), as well as the ECDF-S catalogs of Lehmer et al. (2005). A total of 33/166 massive GNS galaxies had counterparts within 1.5 arcsec across all catalogs.

2. Following Alonso-Herrero et al. (2006) and Donley et al. (2008), we look for AGN power-law emission over z = 1-3 using SEDs from the IRAC bands at 3.6, 4.5, 5.8, and 8.0 μm. The IRAC SEDs were fit with a power-law SED (f_ν ∝ ν^α). There are only 3/166 sources with power-law index α < -0.5 that are considered power-law galaxies (PLGs) and obscured AGN candidates.

3. Heavily obscured AGN may be present in highly reddened, IR-excess galaxies. Fiore et al. (2008) identify obscured AGN candidates in IR-bright, optically faint, red galaxies over z = 1.2-2.6 using the criteria f_24μm / f_R > 1000 and R - K > 4.5. We search for such IR-bright, optically faint systems with f_24μm / f_R > 1000 in our sample of massive galaxies. The R-band flux is determined by linear interpolation between the ACS V and i-band fluxes. We find 25 sources meeting this criterion, of which 16 are new AGN candidates not identified via the above two methods.
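The power-law selection in item 2 can be illustrated with a simple least-squares fit in log-log space. This is a minimal sketch, not the fitting procedure of the cited papers: it fits f_ν ∝ ν^α to the four IRAC bands without weighting by flux errors, and any fit-quality requirement used in the original selection is not reproduced. The function names and the synthetic fluxes are hypothetical.

```python
import math

IRAC_WAVELENGTHS_UM = (3.6, 4.5, 5.8, 8.0)
C_UM_PER_S = 2.998e14  # speed of light in micron per second


def power_law_index(fluxes, wavelengths_um=IRAC_WAVELENGTHS_UM):
    """Unweighted least-squares slope alpha of log10(f_nu) vs log10(nu)."""
    x = [math.log10(C_UM_PER_S / w) for w in wavelengths_um]
    y = [math.log10(f) for f in fluxes]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def is_power_law_galaxy(fluxes, threshold=-0.5):
    """True if the fitted index is redder than the threshold (alpha < -0.5)."""
    return power_law_index(fluxes) < threshold


# Synthetic SEDs: f_nu rising with wavelength (alpha = -1) and
# falling with wavelength (alpha = +1).
red_sed = [10.0 * w for w in IRAC_WAVELENGTHS_UM]
blue_sed = [10.0 / w for w in IRAC_WAVELENGTHS_UM]
print(power_law_index(red_sed), is_power_law_galaxy(red_sed))
print(power_law_index(blue_sed), is_power_law_galaxy(blue_sed))
```

Since a pure power law keeps the same slope under redshifting, the fit can be done in the observed frame.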

Among the 166 massive GNS galaxies at z = 1-3, the AGN fraction is 49/166 or 29.5 ± 3.5%. When the results are broken down in terms of redshift, the AGN fraction rises with redshift, increasing from 17.9 ± 6.1% at z = 1-1.5 to 40.3 ± 8.8% at z = 2-3. The percentage of AGN among all massive GNS galaxies is higher than at z ~ 1, where it is reported that less than 15% of the total 24 μm emission at z < 1 is in X-ray luminous AGN (e.g., Silva et al. 2004; Bell et al. 2005; Franceschini et al. 2005; Brand et al. 2006).
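The quoted uncertainty on the AGN fraction is consistent with a simple binomial standard error, sqrt(p(1-p)/N). Whether this exact estimator was used for every fraction in the paper is an assumption of this sketch; the function name is hypothetical.

```python
import math


def fraction_with_binomial_error(k, n):
    """Return the fraction k/n and its binomial standard error."""
    p = k / n
    return p, math.sqrt(p * (1.0 - p) / n)


# AGN candidates among the massive GNS galaxies, as quoted in the text.
frac, err = fraction_with_binomial_error(49, 166)
print(f"{100 * frac:.1f} +/- {100 * err:.1f}%")  # prints "29.5 +/- 3.5%"
```

This estimator becomes unreliable for very small samples or for fractions near 0 or 1, which is worth keeping in mind for the smaller subsamples discussed in the text.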

6.2. Relation Between AGN Activity and Structure 

We summarize the properties of the AGN host candidates and discuss their implications in terms of host galaxy structure.

Figure 17 shows the single Sersic n versus r_e. Most (80.6 ± 7.9%) of the AGN hosts at z = 2-3 have r_e > 2 kpc and are not ultra-compact. AGN appear to be found preferentially in the more extended galaxies. Indeed, at z = 2-3, the AGN fraction in ultra-compact galaxies is ~ 2.7 times lower than in extended galaxies (20.0 ± 16.3% versus 53.2 ± 10.0%). At z = 1-2 the deficiency is a factor of 5.6. Thus, the ultra-compact galaxies are more quiescent in terms of both AGN activity and SFR activity (see §4.3).

Furthermore, a significant fraction of these AGN (64.6 ± 10.7%) have disky (n < 2) morphologies. Over half (58.2 ± 11.6%) of the AGN candidates are both disky and not ultra-compact. Similar statistics apply over z = 1-2. The disky nature of AGN hosts at 1.5 < z < 3 has been measured previously by Schawinski et al. (2011). From decomposition of the rest-frame optical light for 20 AGN imaged with HST WFC3, they measure a mean Sersic index of 2.54 and a mean effective radius of 3.16 kpc. Their results for (n, r_e) are consistent with our results for z = 2-3 in Table 3 and Figure 17. Furthermore, Kocevski et al. (2011, in prep.) find from visual classification of rest-frame optical morphologies that 51.4 (+5.8/-5.9)% of X-ray selected AGN (L_X ~ 10^42-44 erg s^-1) at 1.5 < z < 2.5 reside in galaxies with visible disks; only 27.4 (+5.8/-4.6)% reside in pure spheroids.

If the disky AGN host candidates host massive black holes, then massive black holes are present in galaxies that are not dominated by a massive spheroid. In the local Universe, nearly all massive galaxies are believed to host a central supermassive black hole (Kormendy 1993; Magorrian et al. 1998; Ferrarese & Merritt 2000; Gebhardt et al. 2000; Marconi & Hunt 2003), and the black hole mass is tightly related to the bulge stellar velocity dispersion (Ferrarese & Merritt 2000; Gebhardt et al. 2000). This has led to the suggestion that the black hole and bulge or spheroid probably grew in tandem (e.g., Cattaneo & Bernardi 2003; Hopkins et al. 2006). The presence at z = 2-3 of luminous and potentially massive black holes in high mass galaxies that do not seem to have a prominent bulge or spheroid may be at odds with this picture.

Footnote (AGN selection methods, §6.1): The mid-IR selection criteria of Lacy et al. (2004) and Stern et al. (2005) were investigated but considered unreliable. Contamination from high-redshift star-forming galaxies drastically reduces their accuracy (e.g., Donley et al. 2008). Applying these methods at z = 1-3 would add more false-positives than true AGN.

7. DISCUSSION

7.1. Do Massive Galaxies With n < 2 at z = 2-3 Host Disks?

We have shown in §3.2 that the majority (64.9% ± 5.4% for M⋆ > 5 × 10^10 M⊙, and 58.5% ± 7.7% for M⋆ > 10^11 M⊙) of massive galaxies at z = 2-3 have low n < 2, while the fraction at z ~ 0 is five times lower. We also demonstrated via artificial redshifting experiments and extensive tests (§3.3 and the Appendix) that this difference between z = 2-3 and z ~ 0 is real and not driven primarily by systematic effects. Furthermore, most (~ 72%) of these massive galaxies with low n < 2 at z = 2-3 are extended with r_e > 2 kpc, rather than being ultra-compact.

What is the nature of the large population of galaxies with low n < 2 at z = 2-3? We present below different lines of evidence which suggest that many of these massive galaxies at z = 2-3 with n < 2, particularly the extended (r_e > 2 kpc) systems, likely host a significant disk component.

1. Some insight into the interpretation of n < 2 values can be gleaned by considering massive galaxies at z ~ 0. As discussed in §3.3.1 and illustrated in Figure 10, massive E and S0s, which are spheroid-dominated and bulge-dominated systems, are predominantly associated with n > 2, both at z ~ 0 and after artificially redshifting to z = 2.5. In contrast, spiral galaxies of intermediate to low bulge-to-total ratios often have an overall low Sersic index n < 2 (Figure 10) because they have a disk component, such as an outer disk or a central disky pseudobulge (e.g., Kormendy & Kennicutt 2004; Jogee 1999; Jogee et al. 2005), which contributes significantly to the total blue light of the galaxy. An extension of these arguments to z = 2-3 suggests that the large fraction (~ 65%) of massive galaxies at z = 2-3 with low n < 2 is driven, at least partially, by the presence of an outer disk or central disky pseudobulge.

2. We next consider the relationship between disk structure and projected ellipticity e. The top panels of Figure 18 show the deconvolved ellipticity e = 1 - b/a determined by GALFIT for the massive (M⋆ > 5 × 10^10 M⊙) galaxies at z = 2-3 with n < 2 and n > 2. The lower left and right panels of Figure 18 show the distributions of deconvolved ellipticity determined by GIM2D of similarly massive spiral (Sabc and Sd/Irr) and E/S0 galaxies in the MGC catalog.

The projected ellipticity distribution of massive galaxies at z = 2-3 with n < 2 is quite different from that of z ~ 0 massive E/S0 galaxies. For local E/S0s, the distribution of e drops sharply at e > 0.35 and there are few systems at e > 0.5. In contrast, for the massive galaxies at z = 2-3 with n < 2, the e distribution continues to rise out to e ~ 0.5. There is also a significant fraction (~ 58%) of systems with n < 2 having e above 0.5, specifically in the range of 0.5 to 0.75. Indeed, a Kolmogorov-Smirnov (KS) test (Table 5) shows that the galaxies at z = 2-3 with n < 2 have a 0% KS-test probability of coming from the same distribution as local massive E/S0s in MGC. These comparisons suggest that the massive galaxies at z = 2-3 with n < 2 are very different from local bulge-dominated and spheroid-dominated E/S0s.

Among the massive systems with n < 2 at z = 2-3, 28.0 ± 6.4% are ultra-compact (r_e < 2 kpc). Thus, our conclusion complements the results of van der Wel et al. (2011), who analyze WFC3 images of a small sample of 14 massive (M⋆ > 6 × 10^10 M⊙), quiescent, and compact (r_e < 2 kpc) galaxies at 1.5 < z < 2.5 and report that most (65 ± 15%) are disk-dominated systems. They find that 5 of 14 galaxies are flat in projection and have an ellipticity > 0.45.

What is the nature of the massive galaxies at z = 2-3 with n < 2? Figure 18 and the KS tests in Table 5 show that the massive galaxies at z = 2-3 with n < 2 are more similar to z ~ 0 massive Sd/Irr (KS probability of 23.5% and D = 0.317) and to z ~ 0 massive Sabc spirals (KS probability of 4.8% and D = 0.221) than to z ~ 0 massive E/S0s. However, the similarity to massive late-type spirals at z ~ 0 is clearly limited, since most massive galaxies at z = 2-3 with n < 2 have smaller half-light radii (r_e primarily below 7 kpc; Figure 5) than any of the z ~ 0 massive systems. It is possible that they host less extended and thicker disks than present-day massive spirals.

Another possibility is that the massive galaxies at z = 2-3 with n < 2 might be related to clump-cluster and chain galaxies (Cowie et al. 1995; van den Bergh et al. 1996; Elmegreen et al. 2005, 2009a, 2009b). Such galaxies very often host disk structures (Elmegreen et al. 2009a), and many of them appear to represent a population of highly clumped disk galaxies viewed at different orientations (Elmegreen et al. 2008; Elmegreen et al. 2005). While clumpy disks may be among the massive GNS galaxies with low n < 2, we cannot identify them due to resolution effects. Finally, we note that in principle a low Sersic index could be the result of a merger that has not fully coalesced. However, as noted in §3.2, most massive GNS galaxies do not visually appear to be made of multiple distorted systems in early phases of mergers. Artificial redshifting of present-day interacting systems shows that our GNS images should be able to resolve systems in early phases of merging, such as NGC 4568 and NGC 3396, but would be unlikely to resolve late merger phases, such as Arp 220, into two separate systems.

Footnote (MGC morphological types): The MGC catalog assigns the 'E/S0' Hubble type and unfortunately does not allow us to identify Es separately.

3. Another line of evidence for massive galaxies at z ~ 2 with potentially thick disks comes from the SINS survey (Genzel et al. 2008; Shapiro et al. 2008; Förster Schreiber et al. 2009), which provides ionized gas kinematics of z ~ 2 star-forming galaxies and finds examples of clumpy, turbulent, and geometrically thick systems having high velocity dispersions (σ ~ 30-120 km/s). About 1/3 of such systems show rotating disk kinematics. Furthermore, Förster Schreiber et al. (2011) find from HST NIC2 imaging that five star-forming galaxies with rotating disk kinematics are well characterized by shallow n < 1 Sersic profiles. Compared to these SINS galaxies, the massive GNS galaxies at z = 2-3 are more massive on average.

4. In this work (§3), we fitted the NIC3/F160W images of the massive galaxies at z = 2-3 with single Sersic components, rather than separate bulge and disk components, because the low resolution (PSF FWHM of 0.3 arcsec, corresponding to ~ 2.4 kpc at z = 1-3) of the images prevents reliable multiple component decomposition for all the galaxies, particularly the fairly compact ones. However, for the galaxies with large r_e > 4 kpc we attempted a bulge-plus-disk decomposition following the techniques outlined in Weinzirl et al. (2009). The decomposition was reliable only for the more extended systems within this group and yielded bulge-to-total light ratios below 0.4, indeed suggesting the presence of a significant disk component among massive galaxies at z = 2-3 with n < 2.

5. It is also interesting to note that most (~ 72% for M⋆ > 5 × 10^10 M⊙) of these massive galaxies at z = 2-3 with low n < 2 are extended (r_e > 2 kpc) rather than ultra-compact systems. This in itself does not prove that disk components exist in low n < 2 systems, but it is suggestive of such a picture. Furthermore, we found in §4.3 that at z = 2-3, the n < 2 disky systems have a wide range of SFR_IR and include systems of the highest SFR_IR. This result is generally consistent with the idea that the systems with n < 2 are actively star-forming and host copious amounts of gas (§5), which tends to settle in disk-like configurations.

6. For completeness, we note that in principle the presence of a massive disk component is not the only way to produce a low Sersic index n < 2 in massive galaxies at z = 2-3. For the ultra-compact (r_e < 2 kpc) massive galaxies with n < 2, it has been argued that such systems could be somewhat like a massive elliptical, which has a bright, high surface brightness central component surrounded by a very extended low surface brightness envelope. If the low surface brightness envelope is somehow not detected by the NIC3/F160W images, then the latter could yield a lower n < 2, as the wings of the surface brightness profile would be effectively clipped. However, this scenario does not seem likely since our artificial redshifting experiments (§3.3.1) show that z ~ 0 massive Es are not degraded into ultra-compact systems. Furthermore, Szomoru et al. (2010) confirm the absence of a low surface brightness halo in an ultra-compact, massive galaxy at z = 1.9 from extremely deep (H ~ 28 mag arcsec^-2) WFC3 imaging.

In summary, based on all the above tests and arguments, we conclude that the massive galaxies at z = 2-3 with n < 2, particularly the more extended systems with r_e > 2 kpc, likely host a massive disk component, which contributes significantly to the rest-frame blue light of the galaxies.
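The comparison of ellipticity distributions in item 2 above relies on a two-sample Kolmogorov-Smirnov test. The sketch below shows how the statistic D and an approximate probability can be computed; it uses the standard asymptotic series for the KS probability and is not the authors' code. The function name and the ellipticity values are hypothetical and serve only to show the mechanics.

```python
import math


def ks_two_sample(sample_a, sample_b):
    """Two-sample KS statistic D and asymptotic p-value.

    D is the maximum distance between the two empirical CDFs. The
    p-value uses the asymptotic alternating series with an effective
    sample size; it is only approximate for small samples.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value <= x (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.3:
        # The series converges slowly here and the probability is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical ellipticities e = 1 - b/a for two small samples.
e_high_z = [0.15, 0.30, 0.42, 0.48, 0.55, 0.61, 0.68, 0.72]
e_local = [0.05, 0.10, 0.14, 0.18, 0.22, 0.27, 0.31, 0.38]
d_stat, p_value = ks_two_sample(e_high_z, e_local)
print(f"D = {d_stat:.3f}, p = {p_value:.4f}")
```

A small p-value means the two samples are unlikely to be drawn from the same parent distribution. In practice a library routine such as `scipy.stats.ks_2samp` would normally be preferred over a hand-written version.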

7.2. Formation of Massive Galaxies By z = 2-3

How do the massive galaxies with ultra-compact (r_e < 2 kpc) and low n < 2 disky structures form by z = 2-3? The surface brightness in the rest-frame B-band of the massive galaxies at z = 2-3 is on average 4.5 magnitudes brighter than massive z ~ 0 galaxies (Figure 7). This implies that a large mass surface density of young-to-intermediate-age stars had to be built up in less than a few Gyr. Implied stellar mass surface densities exceed several 10^[...] M⊙ pc^-2 even for conservative mass-to-light ratios. This implies that rapid and highly dissipative gas-rich events must have led to the formation of these massive galaxies by z = 2-3. Both gas accretion and wet major mergers at z > 2 are likely to have played an important role because at such high redshifts, the short dynamical timescales associated with mergers and the short cooling time associated with gas accretion imply that both mechanisms would lead to a rapid buildup of cold gas. The latter can in turn lead to rapid star formation and dense stellar remnants (e.g., Wuyts et al. 2009; Wuyts et al. 2010; Khochfar & Silk 2011; Bournaud et al. 2011).
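To give a sense of scale for the surface brightness difference quoted above, a magnitude difference can be converted into a linear ratio with the standard relation ratio = 10^(0.4 Δm). The short sketch below applies it to the 4.5 magnitude offset; the function name is hypothetical, and converting the result into a stellar mass surface density would additionally require a mass-to-light ratio, which is not attempted here.

```python
def surface_brightness_ratio(delta_mag):
    """Linear surface-brightness ratio for a magnitude difference."""
    return 10.0 ** (0.4 * delta_mag)


# Average rest-frame B-band offset between z = 2-3 and z ~ 0 galaxies.
ratio = surface_brightness_ratio(4.5)
print(f"4.5 mag corresponds to a factor of about {ratio:.0f}")
```

A 4.5 magnitude offset therefore corresponds to a rest-frame B-band surface brightness that is higher by a factor of roughly 60.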

A further constraint on the formation pathway is provided by the structure of the massive galaxies at z = 2-3. We have shown in §3.2 that as much as ~ 65% of the massive galaxies at z = 2-3 have a low n < 2, and we further argued in §7.1 that most of these systems with n < 2 at z = 2-3 likely host a massive disk component. Major mergers of low-to-moderate gas fraction (e.g., < 30%) will typically produce merger remnants with a de Vaucouleurs type profile and a Sersic index n > 3 (Naab, Khochfar, & Burkert 2006; Naab & Trujillo 2006). Mergers with moderate-to-high gas fractions are expected to produce lower Sersic n that are still in general > 2. For instance, Figure 14 of Hopkins et al. (2009) shows the Sersic index of major merger remnants for a range of orbits and a range of progenitors with gas fractions spanning 10-100%. Although some massive (M⋆ > 10^11 M⊙) remnants with n ~ 1 arise in mergers with f_gas > 80%, most remnants of gas-rich (f_gas ~ 40%) mergers have a Sersic index n > 2. Furthermore, Rothberg & Joseph (2004) find from K-band imaging of 51 merger remnants that ~ 51% (26/51) have n > 3, ~ 37% (19/51) have n ~ 2-3, and only a small fraction (~ 12%, 6/51) have n ~ 1-2. Thus, when considering isolated gas-rich major mergers, namely those not fed by cold streams, it is challenging to produce a population of merger remnants where ~ 65% of the systems have n < 2.

The challenge of producing a large population of disky (n < 2) systems with high SFRs from isolated gas-rich major mergers may be an indication that the accretion of cold gas along cosmological filaments (Birnboim & Dekel 2003; Keres et al. 2005; Dekel & Birnboim 2006; Dekel et al. 2009a; Dekel et al. 2009b; Keres et al. 2009; Brooks et al. 2009; Ceverino et al. 2010) may be particularly important in the build-up of massive galaxies by z = 2-3. As merger remnants at z > 2 acquire gas via cold-mode accretion, a gas disk is expected to form (Khochfar & Silk 2009a; Burkert et al. 2010). Depending on the angular momentum of the accreted gas, it can settle into a compact disk component or into an outer extended disk. Burkert et al. (2010) discuss a scenario where turbulent rotating disks can form, segregating into compact (r_e ~ 1-3 kpc) dispersion-dominated (1 < v/σ < 3) systems and more extended (r_e ~ 4-8 kpc), rotation-dominated (v/σ > 3) disks. The formation of a gas disk via cold-mode accretion, and its subsequent conversion into a stellar disk, would lower the overall Sersic index of the massive galaxies at z = 2-3, bringing them more in line with the observed values.

However, many key questions remain unanswered. Can theoretical models account for the observed fractions of massive galaxies with low n < 2, as well as the fraction of galaxies with ultra-compact (r_e < 2 kpc) sizes? Can the relation between structure, SFR, and AGN activity discussed in §4.3 and §6.2, as well as the range in SFR at a given stellar mass, be accounted for? We will address these questions in a future paper (Jogee et al., in preparation) where we perform detailed comparisons to different theoretical scenarios.

7.3. Transformation of Massive Galaxies at z = 2-3 Into Present-Day E and S0s

Next we discuss the transformation of massive galaxies at z = 2-3 into their more massive present-day descendants, which are primarily E and S0s. During this transformation, the massive galaxies will need to significantly increase n, since the majority (~ 65%) of massive galaxies at z = 2-3 have low n < 2, while the corresponding fraction among massive systems at z ~ 0 is five times lower (Table 1 and Figure 5). Similarly, the galaxies will also need to significantly raise r_e, since approximately 40% of massive galaxies at z = 2-3 are in the form of ultra-compact (r_e < 2 kpc) galaxies, compared to less than 1% at z ~ 0 (Table 1 and Figure 5). In general, the massive z = 2-3 galaxies must experience a substantial growth in r_e by up to a factor of ~ 6, a dimming in rest-frame optical surface brightness within r_e by up to 6 magnitudes (Figure 7), and their n must increase to n > 2. An increase in (n, r_e) and a dimming in μ_e can be achieved via several pathways.

A natural pathway to produce large changes in (n, r_e, μ_e) is a dry major merger of two disk systems. This produces a remnant with n ~ 4, a higher r_e, and a lower surface brightness within r_e than the progenitors (Naab, Khochfar, & Burkert 2006; Naab & Trujillo 2006; Naab et al. 2009). In this case, the change in n is produced by the transformation of galaxies with disks into systems dominated by spheroids or bulges. This type of transformation must take place from z = 2-3 to z ~ 0 in many of the massive galaxies because ~ 65% of them at z = 2-3 have n < 2, which we argued is indicative of a massive disk in many cases (§7.1). In contrast, the E/S0s at z ~ 0 are dominated by spheroids or bulges.

Other lines of evidence support the idea that dry major mergers play a role in making the most massive z ~ 0 ellipticals. The most massive local ellipticals are found to harbor cores (missing light), which are believed to be scoured by binary black holes that form in dry major mergers (Kormendy et al. 2009). From a study of the tidal features associated with bulge-dominated early-type galaxies, van Dokkum (2005) concludes that today's most luminous ellipticals form through mergers of gas-poor, bulge-dominated systems. Kriek et al. (2008) focus on massive red-sequence galaxies at z ~ 2.3 with little or no ongoing star formation, finding that the changes in color and number density of galaxies on the high-mass end (M⋆ > 1 × 10^11 M⊙) of the red sequence from z ~ 2.3 to the present are better explained by a combination of passive evolution and red mergers that induce little star formation, rather than by passive evolution alone.

While dry major mergers play a role in the evolution of massive galaxies, it remains debated whether they can account for the full size and mass evolution of massive galaxies. From a theoretical standpoint, the predicted dry major merger rate appears to be too low. From simulations, Khochfar & Silk (2009b) find that only between 10% and 20% of massive (M⋆ > 6.3 × 10^10 M⊙) galaxies have had a dry major merger in the last Gyr at any redshift z < 1. Hopkins et al. (2010) find from semi-empirical models that the importance of major mergers in bulge formation scales with galaxy stellar mass. Namely, an L* galaxy with M⋆ ~ 10^11 M⊙ at z = 0 will experience only one dry major merger at z < 2. Shankar et al. (2010) calculate that the frequency of dry mergers increases with final stellar mass, and they find that by z = 0 massive (M⋆ > 10^11 M⊙) early-type galaxies undergo on average < 1 dry major merger since their formation.



From an observational standpoint, direct measurements of the dry major merger rate at z < 1 are highly uncertain. Bell et al. (2006) suggest that present-day spheroidal galaxies with M_V < -20.5 on average have undergone anywhere between 0.5 and 2 dry major mergers since z ~ 0.7. The analysis carries large uncertainties as it is based on a small number (~ 6) of observed dry major mergers. Several observational studies report that between 16% and 35% of massive (M⋆ > 2.5 × 10^10 M⊙) galaxies have undergone a major merger since z ~ 0.8 (e.g., Jogee et al. 2009; Lotz et al. 2008; Conselice et al. 2009), but it should be noted that most of the major mergers in the above studies are star-forming systems, and there are very few dry major mergers. Robaina et al. (2010) find that galaxies with M⋆ > 1 × 10^11 M⊙ have undergone, on average, only 0.5 mergers since z ~ 0.7 involving progenitor galaxies that are both more massive than 5 × 10^10 M⊙. Hammer et al. (2009) focus on starbursts with disturbed ionized gas morphologies and kinematics at z ~ 0.65, and they argue based on modeling that ~ 6 Gyr ago 46% of the galaxy population was involved in major mergers, most of which were gas-rich. Kaviraj et al. (2011) find that theoretically and empirically determined major merger rates at z < 1 are too low by factors of a few to account for the fraction of disturbed systems they find among morphologically classified early-type massive (M⋆ > 1 × 10^10 M⊙) galaxies at 0.5 < z < 0.7. They suggest that the overall evolution of massive early-type galaxies, particularly the low-level star formation activity, may be heavily influenced by minor merging at late epochs. At higher redshifts 1 < z < 2, higher major merger rates are reported than at z < 1 (e.g., Conselice et al. 2003), but the frequency of dry major mergers is claimed to be low (Williams et al. 2011).

An alternate pathway that could be at least as important as major mergers consists of consecutive dry minor mergers or accretion of externally formed stars such that stellar mass is cumulatively added to the outskirts of a compact galaxy (e.g., Naab et al. 2009; Feldmann et al. 2010). Naab & Trujillo (2006) show that successive minor mergers can, on average, raise the Sersic index of the merger remnant about as effectively as major mergers. Furthermore, it is claimed from simulations and analytical arguments that dry minor mergers produce a much larger increase in size (r_e) and a larger fall in average stellar mass densities within r_e than do dry major mergers (Naab et al. 2009; Bezanson et al. 2009). Shankar et al. (2011) find in simulations that massive (M⋆ > 10^11 M⊙) z ~ 0 galaxies grow primarily by dry minor mergers, especially at z < 1. Oser et al. (2010; 2011) use cosmological simulations to study 40 individual massive galaxies with present-day stellar masses of M⋆ > 6.3 × 10^10 M⊙. They find that massive galaxies at z > 2 are dominated by "in situ" star formation fueled by in-falling cold gas within the galaxy. As cold-mode accretion becomes inefficient at z ~ 2, accretion of externally created stars (i.e., stellar satellites) dominates at z < 2. For galaxies of present-day stellar mass M⋆ > 6.3 × 10^10 M⊙, the average number-weighted merger mass-ratio is ~ 1:16, while the average mass-weighted merger mass-ratio is ~ 1:5. In other words, the mass growth since z ~ 2 is dominated by minor mergers with a mass ratio of 1:5. The importance of stellar accretion increases with galaxy mass and toward lower redshift, and it substantially raises the galaxy stellar mass and size. For systems with present-day stellar mass M⋆ > 6.3 × 10^10 M⊙, a size evolution of up to a factor of ~ 5-6 occurs from z = 2 to z ~ 0. However, one strong caveat of these simulations is that all their massive (M⋆ > 1 × 10^11 M⊙) galaxies at z = 2 are ultra-compact (r_e < 2 kpc), while observations (see Figure 5) show that a large fraction of such massive galaxies at z = 2 are extended (r_e = 3-10 kpc), with a wide range in star formation rate. The increase of size and mass induced by minor mergers in these simulations is qualitatively in agreement with our results on size evolution for the ultra-compact systems and also with the inside-out growth reported by van Dokkum et al. (2010) from stacking deep rest-frame R-band images of massive galaxies over the redshift range of 0.6 to 2.0.

However, many questions remain unresolved. While dry
minor mergers appear to be effective at inducing significant
evolution in mass and size from z ~ 2 to z ~ 0 in the
simulations of Oser et al. (2010; 2011), it is unclear if they
can really drive the large change in Sersic index n required
by the observations. Furthermore, these simulations focus
only on ultra-compact ($r_e < 2$ kpc) galaxies, and are not
representative of the large dominant population of more
extended galaxies at z = 2-3. Finally, it is not clear
whether minor mergers can account for the changes in effective
surface brightness between z = 2-3 and z ~ 0. We
will evaluate these issues more thoroughly with a detailed
comparison to models in a subsequent paper (Jogee et al.
in preparation).

8. SUMMARY 

We present a study of the structure, activity, and evolution
of massive galaxies at z = 1-3 using deep (5σ limiting
magnitude of H = 26.8 AB for an extended source of
diameter 0.7"), high resolution (PSF ~ 0.3") NIC3/F160W
images from the GOODS-NICMOS Survey (GNS), along
with complementary ACS, Spitzer IRAC and MIPS, and
Chandra X-ray data. One of the strengths of our study
is that the NIC3/F160W data provide rest-frame optical
imaging over z = 1-3 for one of the largest (166 galaxies
with $M_\star > 5 \times 10^{10}\,M_\odot$ and 82 with $M_\star > 10^{11}\,M_\odot$),
most diverse, and relatively unbiased samples of massive
galaxies at z = 1-3 studied to date. Our main results
are summarized below.

1. Structure of massive galaxies at rest-frame optical
wavelengths: We analyze the rest-frame optical structure
of the massive galaxies by fitting single Sersic profiles
to the 2D light distribution in the NIC3/F160W images.
We find that the rest-frame optical structures of the massive
galaxies are very different at z = 2-3 compared to
z ~ 0, with their Sersic index n and half-light radius $r_e$
being strikingly offset toward lower values compared to
z ~ 0 (Table 1 and Figure 5). Through extensive tests
and artificial redshifting experiments we conclude that the
offset in (n, $r_e$) between massive galaxies at z = 2-3
and z ~ 0 is real and not primarily driven by systematic
effects related to the fitting techniques, instrumental
effects, or redshift-dependent effects (e.g., cosmological
surface brightness dimming and the loss of spatial resolution).
In effect, we find a large population of ultra-compact
($r_e < 2$ kpc) systems, as well as a dominant population of
systems with low n < 2 disky morphologies at z = 2-3.
We further describe these populations below.

We find that approximately 40% (39.0 ± 5.6% for $M_\star > 5 \times 10^{10}\,M_\odot$
and 39.0 ± 7.6% for $M_\star > 1 \times 10^{11}\,M_\odot$)
of the massive galaxies at z = 2-3 are in the form of
ultra-compact ($r_e < 2$ kpc) galaxies compared to less than
1% at z ~ 0 (Table 1 and Figure 5). These ultra-compact
galaxies are practically unmatched among z ~ 0 massive
galaxies, and their surface brightness in the rest-frame optical
can be 4-6 magnitudes brighter (Figure 7).

Secondly, we find that the majority (64.9% ± 5.4% for
$M_\star > 5 \times 10^{10}\,M_\odot$, and 58.5% ± 7.7% for $M_\star > 10^{11}\,M_\odot$)
of massive galaxies at z = 2-3 have low n < 2, while the
corresponding fraction among massive systems at z ~ 0 is
five times lower. Most (~ 72%) of these massive galaxies
at z = 2-3 with low n < 2 have $r_e > 2$ kpc, and therefore
complement the ultra-compact galaxies. We further
explore the meaning of a Sersic index n < 2 at z = 2-3,
and present evidence that most of the massive galaxies
with n < 2 at z = 2-3, particularly the extended ($r_e > 2$
kpc) ones, likely host a prominent disk, unlike the majority
of massive galaxies at z ~ 0. Our evidence is based on
rest-frame optical morphologies, ellipticities, artificial
redshifting experiments, as well as bulge-to-total ratios from
bulge-plus-disk decompositions of extended systems.

2. Star formation rates: We estimate star formation
rates using IR luminosities (8-1000 μm) derived from the
Spitzer 24 μm flux for massive GNS galaxies having a secure
MIPS 24 μm counterpart and a 24 μm flux exceeding
the 5σ detection limit of 30 μJy. AGN host candidates are
excluded because the inferred IR luminosities overestimate
the true star formation rate.

We find a strong link between galaxy structure and SFR.
Among the non-AGN massive ($M_\star > 5 \times 10^{10}\,M_\odot$) galaxies
at z = 2-3 with $\mathrm{SFR_{IR}}$ high enough to yield a 5σ (30 μJy)
Spitzer 24 μm detection, the majority (84.6 ± 10.0%) have
low n < 2. While the n < 2 disky systems have a wide
range of $\mathrm{SFR_{IR}}$ (53 to 1466 $M_\odot\,\mathrm{yr}^{-1}$ at z = 2-3), they
include the systems of the highest $\mathrm{SFR_{IR}}$ at both z = 1-2
and z = 2-3. In contrast, the massive ultra-compact objects
at z = 2-3 are less likely by a factor of ~ 2.2 to have
$\mathrm{SFR_{IR}}$ above the detection limit, compared to the whole
sample of non-AGN massive galaxies.

3. AGN activity: Using a variety of techniques (X-ray
properties, IR power-law, and IR-to-optical excess) to
identify AGN, we find that 49/166 (29.5 ± 3.5%) of the massive
galaxies at z = 1-3 are AGN candidates. The AGN
fraction rises with redshift, increasing from 17.9 ± 6.1% at
z = 1-1.5 to 40.3 ± 8.8% at z = 2-3 (Table 4).

We find a relationship between host galaxy structure
and AGN activity that complements the relationship between
SFR and structure. Among massive galaxies at
z = 2-3, AGN appear to be found preferentially in galaxies
that are not ultra-compact, as evidenced by the fact
that most (80.6 ± 7.9%) AGN hosts have $r_e > 2$ kpc. In
fact, at z = 2-3, the AGN fraction in ultra-compact
galaxies is ~ 2.7 times lower than in extended galaxies
(20.0 ± 16.3% versus 53.2 ± 10.0%). Thus, ultra-compact
galaxies appear quiescent in terms of both SFR and AGN
activity. In terms of their Sersic index n, a large fraction
(64.6 ± 10.7%) of AGN hosts at z = 2-3 have disky (n < 2)
morphologies.

4. Cold gas content: We apply a standard Schmidt-Kennicutt
law (Kennicutt 1998) to the $\mathrm{SFR_{IR}}$ of the non-AGN
host candidates. The high estimated $\mathrm{SFR_{IR}}$ suggest
that copious cold gas reservoirs are present. We estimate
that the average cold gas surface density in non-AGN hosts
ranges from ~ 136 to ~ 25,091 $M_\odot\,\mathrm{pc}^{-2}$ at z = 1-3, with
a median value of ~ 607 $M_\odot\,\mathrm{pc}^{-2}$ (Figure 15). The implied
cold gas fraction within the rest-frame optical half-light
radius ranges from 6.5% to 65.4%, with a mean of
~ 41% at z = 2-3 (Figure 15). The highest gas fractions
at a given redshift are found among the less massive
galaxies, consistent with downsizing.

5. Formation of massive galaxies by z = 2-3: The
massive galaxies at z = 2-3 already have an average
rest-frame optical surface brightness within $r_e$ that can be
up to 3-6 magnitudes brighter than z ~ 0 massive galaxies.
The associated high stellar mass densities imply that
massive galaxies at z = 2-3 must have formed via rapid,
highly dissipative events at z > 2. Both gas-rich major
mergers and gas accretion at z > 2 are viable as their
associated short dynamical timescales and short gas cooling
times at z > 2 would lead to a rapid buildup of mass.
However, the large fraction (~ 65%) of massive galaxies at
z = 2-3 with n < 2 and disky morphologies suggests that
cold-mode accretion at z > 2 must have played an important
role in the build-up of massive galaxies by z = 2-3,
since it may be challenging to have such a large fraction
of merger remnants with low n < 2 from isolated gas-rich
major mergers.

6. Transformation of massive galaxies at z = 2-3 into
present-day E and S0s: In order for massive galaxies at
z = 2-3 to evolve into z ~ 0 massive systems (which
are primarily E and S0s), they need to radically change
their rest-frame optical structure and distributions of (n,
$r_e$). In particular they need to raise n well above 2, increase
$r_e$ by an average factor of 3-4, and dim the average
rest-frame optical surface brightness. Dry major mergers
can induce changes in galaxy size, Sersic index, and stellar
surface density, but they may be too rare to account
for all the needed evolution. Successive dry minor mergers
have been shown to influence galaxy size, Sersic index,
and stellar surface density in a similar direction. We suggest
the transformation of massive z = 2-3 galaxies into
z ~ 0 galaxies will occur through a combination of dry
major mergers and minor mergers. We will investigate the
relative importance and efficiency of these mechanisms in
a future paper.
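
The cold gas estimate in summary item 4 can be illustrated with a minimal sketch that inverts the Schmidt-Kennicutt law. The normalization and slope are the commonly quoted Kennicutt (1998) values and are assumptions of this sketch, as are the function names, the choice to place half of the SFR and half of the stellar mass inside $r_e$, and the example inputs; none of these details are specified in the text above.

```python
import math

# Commonly quoted Kennicutt (1998) values; assumed here, not taken from this paper.
SK_NORM = 2.5e-4   # Msun / yr / kpc^2 for a gas surface density of 1 Msun / pc^2
SK_SLOPE = 1.4


def sfr_surface_density(sigma_gas):
    """Forward law: gas surface density (Msun/pc^2) -> SFR density (Msun/yr/kpc^2)."""
    return SK_NORM * sigma_gas ** SK_SLOPE


def gas_surface_density(sigma_sfr):
    """Inverted law: SFR density (Msun/yr/kpc^2) -> gas surface density (Msun/pc^2)."""
    return (sigma_sfr / SK_NORM) ** (1.0 / SK_SLOPE)


def cold_gas_estimate(sfr, r_e_kpc, m_star):
    """Hypothetical helper: gas surface density and gas fraction inside r_e.

    Assumes half of the SFR and half of the stellar mass lie within r_e.
    """
    area_kpc2 = math.pi * r_e_kpc ** 2
    sigma_sfr = 0.5 * sfr / area_kpc2
    sigma_gas = gas_surface_density(sigma_sfr)
    m_gas = sigma_gas * area_kpc2 * 1.0e6  # 1 kpc^2 = 1e6 pc^2
    gas_fraction = m_gas / (m_gas + 0.5 * m_star)
    return {"sigma_gas": sigma_gas, "m_gas": m_gas, "gas_fraction": gas_fraction}


# Illustrative inputs only: SFR = 200 Msun/yr, r_e = 3 kpc, M_star = 1e11 Msun.
example = cold_gas_estimate(200.0, 3.0, 1.0e11)
print(example)
```

Because the law is a power law with slope 1.4, a factor of ten in SFR surface density maps to roughly a factor of five in inferred gas surface density, which is why the inferred gas densities span a narrower dynamic range than the SFRs.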
Table 1
Rest-Frame Optical Sersic Index n and $r_e$ in Massive ($M_\star > 5 \times 10^{10}\,M_\odot$) Galaxies

z                   Morphology   n < 2            n > 2            n > 3

$M_\star > 5 \times 10^{10}\,M_\odot$
z = 2-3 (N = 77)    All          64.9 ± 5.44%     35.1 ± 5.44%     18.2 ± 4.40%
z = 1-2 (N = 89)    All          49.4 ± 5.30%     50.6 ± 5.30%     30.3 ± 4.87%
z ~ 0 (N = 385)     All          13.0 ± 1.71%     87.0 ± 1.71%     74.3 ± 2.23%
                    E/S0         0.8 ± 0.45%      64.9 ± 2.43%     58.7 ± 2.51%
                    Sabc         10.4 ± 1.56%     20.8 ± 2.07%     14.8 ± 1.81%
                    Sd/Irr       1.82 ± 0.68%     1.30 ± 0.58%     0.78 ± 0.45%

$M_\star > 1 \times 10^{11}\,M_\odot$
z = 2-3 (N = 41)    All          58.5 ± 7.69%     41.5 ± 7.69%     17.1 ± 5.88%
z = 1-2 (N = 41)    All          34.1 ± 7.41%     65.9 ± 7.41%     43.9 ± 7.45%
z ~ 0 (N = 115)     All          10.4 ± 2.85%     89.6 ± 2.85%     80.9 ± 3.67%
                    E/S0         1.7 ± 1.22%      72.2 ± 4.18%     67.0 ± 4.39%
                    Sabc         6.09 ± 2.23%     13.9 ± 3.23%     12.2 ± 3.05%
                    Sd/Irr       2.61 ± 1.49%     3.48 ± 1.71%     1.74 ± 1.22%

z                   Morphology   $r_e < 2$ kpc    $2 < r_e < 4$ kpc   $r_e > 4$ kpc

$M_\star > 5 \times 10^{10}\,M_\odot$
z = 2-3 (N = 77)    All          39.0 ± 5.56%     42.9 ± 5.64%     18.2 ± 4.40%
z = 1-2 (N = 89)    All          24.7 ± 4.57%     48.3 ± 5.30%     27.0 ± 4.70%
z ~ 0 (N = 385)     All          0.52 ± 0.37%     1.8 ± 0.68%      97.7 ± 0.77%
                    E/S0         0.26 ± 0.26%     1.8 ± 0.68%      63.6 ± 2.45%
                    Sabc         0.00 ± 0.00%     0.0 ± 0.00%      31.2 ± 2.36%
                    Sd/Irr       0.26 ± 0.26%     0.00 ± 0.00%     2.86 ± 0.85%

$M_\star > 1 \times 10^{11}\,M_\odot$
z = 2-3 (N = 41)    All          39.0 ± 7.62%     41.5 ± 7.69%     19.5 ± 6.19%
z = 1-2 (N = 41)    All          22.0 ± 6.46%     56.1 ± 7.75%     22.0 ± 6.46%
z ~ 0 (N = 115)     All          0.87 ± 0.87%     1.74 ± 1.22%     97.39 ± 1.49%
                    E/S0         0.00 ± 0.00%     1.7 ± 1.22%      72.2 ± 4.18%
                    Sabc         0.00 ± 0.00%     0.00 ± 0.00%     20.0 ± 3.73%
                    Sd/Irr       0.87 ± 3.73%     0.00 ± 0.00%     5.22 ± 2.07%

Note: Rows 1-12: For a given redshift (Column 1), morphology (Column 2), and
stellar mass range, Columns 3, 4, and 5 list the fraction of galaxies in three separate
bins of Sersic index n. Rows 13-24: Same as the above except that Columns 3, 4, and 5
reflect bins of half-light radius $r_e$.
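
The uncertainties quoted in Table 1 appear consistent with a simple binomial error, $\sqrt{p(1-p)/N}$, on each fraction. That reading is inferred from the tabulated numbers rather than stated in the table note, and the galaxy count used below (50 of 77) is a hypothetical value chosen because it reproduces the z = 2-3, n < 2 entry. A minimal sketch:

```python
import math


def fraction_with_error(k, n):
    """Return (percent, 1-sigma binomial error in percent) for k objects out of n."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    p = k / n
    return 100.0 * p, 100.0 * math.sqrt(p * (1.0 - p) / n)


# Hypothetical count: 50 of the 77 galaxies at z = 2-3 with n < 2.
pct, err = fraction_with_error(50, 77)
print(f"{pct:.1f} +/- {err:.2f}%")
```

One consequence of this estimator is that an empty bin (k = 0) receives zero uncertainty, which would explain the "0.00 ± 0.00%" entries; a more conservative treatment of small counts would give non-zero upper limits.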
Table 2
Fit of Massive Galaxies to $r_e / r_{e,z\sim0} = \alpha (1+z)^{\beta}$ Over z = 0-3

Sample                                         $\alpha$ (±1σ)    $\beta$ (±1σ)
(1)                                            (2)               (3)

All n                                          1.15 (0.30)       -1.30 (0.24)
n < 2                                          1.11 (0.32)       -1.30 (0.29)
n > 2                                          1.20 (0.31)       -1.52 (0.26)
Non-AGN hosts with high $\mathrm{SFR_{IR}}$ (a)   1.15 (0.33)       -1.22 (0.30)
Non-AGN hosts with low $\mathrm{SFR_{IR}}$ (b)    1.67 (0.33)       -1.67 (0.28)

Note: (a) Non-AGN hosts with 24 μm flux above the Spitzer
5σ limit (30 μJy). (b) Non-AGN hosts with 24 μm flux below the
Spitzer 5σ limit (30 μJy).
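
To make the Table 2 fits concrete, the sketch below evaluates the power law $r_e / r_{e,z\sim0} = \alpha (1+z)^{\beta}$ at a chosen redshift using the best-fit values from the table. The dictionary and function names are illustrative, and the quoted 1σ uncertainties are ignored, so the outputs are central values only.

```python
# Best-fit (alpha, beta) pairs copied from Table 2; uncertainties are not propagated.
FITS = {
    "All n": (1.15, -1.30),
    "n < 2": (1.11, -1.30),
    "n > 2": (1.20, -1.52),
}


def size_ratio(z, alpha, beta):
    """Return r_e(z) / r_e(z ~ 0) for the power-law fit alpha * (1 + z)**beta."""
    return alpha * (1.0 + z) ** beta


for sample, (alpha, beta) in FITS.items():
    ratio = size_ratio(2.5, alpha, beta)
    print(f"{sample}: r_e at z = 2.5 is {ratio:.2f} of the z ~ 0 value")
```

For the "All n" fit this gives a ratio of about 0.23 at z = 2.5, that is, sizes smaller by a factor of roughly four relative to z ~ 0.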
Table 3
Fraction of Massive ($M_\star > 5 \times 10^{10}\,M_\odot$) Galaxies With 24 μm Detections

z          $\mathrm{SFR_{IR,lim}}$        Fraction with            Mean SFR (AGN + non-AGN)     Mean SFR (Non-AGN)
                                          $f_{24\mu m}$ > 30 μJy
           ($M_\odot\,\mathrm{yr}^{-1}$)  (%)                      ($M_\odot\,\mathrm{yr}^{-1}$)   ($M_\odot\,\mathrm{yr}^{-1}$)
(1)        (2)                            (3)                      (4)                          (5)

1-1.5      4.29                           43.6 ± 7.9%              63.8 ± 12.9                  44.0 ± 7.3
1.5-2      12.4                           48.0 ± 7.1%              222.8 ± 58.5                 128.9 ± 37.6
2-3        47.2                           42.9 ± 5.6%              1145.6 ± 274.5               418.8 ± 142.9

Note: Column 2 estimates the detection limit on SFR given the 5σ limit on $f_{24\mu m}$ of 30 μJy. The expected $\mathrm{SFR_{IR}}$
at 30 μJy is determined by linear regression of the distribution of $f_{24\mu m}$ versus $\mathrm{SFR_{IR}}$ in each redshift bin. Column 3 lists
the percentage of massive GNS galaxies with $f_{24\mu m}$ > 30 μJy. Column 4 shows the mean SFR among all galaxies having
$f_{24\mu m}$ > 30 μJy. Column 5 shows the mean SFR among all galaxies without any evidence for AGN activity (see §6.1). The
error bars in Column 4 and Column 5 represent the standard error on the mean.
Table 4
Summary of AGN Detection and Properties

z             Total Number   X-ray AGN   PLG   IR Excess AGN   AGN Fraction    Median n   Median $r_e$ (kpc)
(1)           (2)            (3)         (4)   (5)             (6)             (7)        (8)

z = 1-1.5     7              7           0     0               17.9 ± 6.1%     2.12       4.48
z = 1.5-2     11             6           0     5               22.0 ± 5.9%     1.85       3.73
z = 2-3       31             20          3     11              40.3 ± 8.8%     1.42       2.83
Table 5
Summary of Kolmogorov-Smirnov Test on Ellipticity

Sample 1          Sample 2                       Probability (%)   KS Test D
(1)               (2)                            (3)               (4)

n < 2, z = 2-3    MGC E/S0                                         0.489
n < 2, z = 2-3    MGC Spiral (Sabc + Sd/Irr)     4.78              0.221
n < 2, z = 2-3    MGC Sd/Irr                     23.5              0.317

n > 2, z = 2-3    MGC E/S0                       34.3              0.184
n > 2, z = 2-3    MGC Spiral (Sabc + Sd/Irr)     14.0              0.237
n > 2, z = 2-3    MGC Sd/Irr                     15.8              0.370

Note: Columns 1 and 2 list the two samples for which ellipticity was
compared in each KS test. Column 3 lists the probability that Sample 1 and
Sample 2 are drawn from the same distribution. Column 4 lists the Kolmogorov-Smirnov
statistic specifying the maximum separation between the cumulative
ellipticity distribution functions for Sample 1 and Sample 2.
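
For readers who want to reproduce the kind of comparison summarized in Table 5, the following is a minimal pure-Python sketch of a two-sample Kolmogorov-Smirnov test. The function name and the toy ellipticity lists are hypothetical, and the asymptotic p-value with a small-sample correction is one standard textbook choice; the implementation actually used for the table is not specified here.

```python
import math


def ks_two_sample(a, b):
    """Return (D, p) for a two-sample KS test.

    D is the maximum separation between the two empirical cumulative
    distributions; p is an asymptotic estimate of the probability that both
    samples are drawn from the same parent distribution.
    """
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        # Step both empirical CDFs past x so that ties are handled together.
        while i < n1 and a[i] <= x:
            i += 1
        while j < n2 and b[j] <= x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    n_eff = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The alternating series converges poorly here; p is effectively 1.
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


# Toy ellipticity samples (illustrative values, not data from the paper).
disky = [0.45, 0.52, 0.38, 0.61, 0.57, 0.49, 0.66, 0.41]
round_ = [0.12, 0.20, 0.25, 0.18, 0.30, 0.22, 0.15, 0.28]
d_stat, prob = ks_two_sample(disky, round_)
print(f"D = {d_stat:.3f}, probability = {100.0 * prob:.2f}%")
```

A large D with a small probability, as in the first row of Table 5, indicates that the two ellipticity distributions are unlikely to share a parent population.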
Appendices

A. PSF MODELING 

Knowledge of the PSF is important to assess data quality and for deriving structural parameters. NIC3 is out of focus, 
so the PSF can deviate from the theoretically expected one. PSF convolution with GALFIT is commonly performed with 
a user-provided bright, unsaturated star. Not all of the GNS tiles contain suitably bright, unsaturated stars. It is not 
advisable to adopt a set of PSF stars from a subset of pointings because the NIC3 PSF depends on position within the 
NIC3 field and is also subject to interpolation artifacts introduced by drizzle that are dependent on the adopted dither 
pattern (J. Krist, private communication). 

As a result, the best available option for handling PSF convolution is to make synthetic NIC3 PSFs with Tiny Tim
(Krist 1995). For each galaxy, Tiny Tim PSFs were generated for all the galaxy's positions in the individual, undrizzled
exposures. Telescope breathing was accounted for with each PSF by refining the Tiny Tim parameters to match the
Pupil Alignment Mechanism (PAM) value recorded in the headers of the undrizzled frames. Blank, zero-valued frames
retaining the WCS information of the undrizzled frames were made. The synthetic PSFs were inserted into the blank
frames precisely where each galaxy would be in the individual frames. The blank frames were drizzled together in the
same way as the data with a pixfrac of 0.7 and a final output platescale of 0.1"/pixel. This process was repeated for all
166 massive ($M_\star > 5 \times 10^{10}\,M_\odot$) galaxies in our sample.

This approach accounts both for variation in PSF with position on the NIC3 field and for the dependence on the
drizzle algorithm. The range of FWHM in the final drizzled synthetic PSFs is ~ 0.26"-0.36" [13] with a mean value of
0.3". The mean PSF diameter of the science images (0.3") is 2.5 kpc at z = 2, under the adopted cosmology.
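
The conversion from an angular PSF size to a physical scale can be checked with a short calculation. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3; these values are an assumption made for illustration, since the adopted cosmology is defined elsewhere in the paper, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h0=70.0, omega_m=0.3, steps=2000):
    """Angular diameter distance (Mpc) in flat LCDM via Simpson's rule.

    `steps` must be even. The default cosmology is an assumed one.
    """
    omega_l = 1.0 - omega_m

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + omega_l)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * inv_e(i * h)
    comoving = (C_KM_S / h0) * total * h / 3.0
    return comoving / (1.0 + z)


def arcsec_to_kpc(theta_arcsec, z, **cosmology):
    """Physical size (kpc) subtended by an angle (arcsec) at redshift z."""
    d_a_mpc = angular_diameter_distance_mpc(z, **cosmology)
    theta_rad = theta_arcsec * math.pi / 648000.0  # arcsec -> radians
    return theta_rad * d_a_mpc * 1000.0


print(f'0.3" at z = 2 corresponds to {arcsec_to_kpc(0.3, 2.0):.2f} kpc')
```

Under these assumed parameters the result is close to the 2.5 kpc quoted above, and the scale changes only slowly across z = 1-3, so a fixed angular resolution corresponds to a nearly fixed physical resolution over the sample.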

B. EXTRA TESTS ON SYSTEMATIC EFFECTS 

B.1. Tests on Robustness of Fits and Parameter Coupling

How robust are the results that a dominant fraction of the massive galaxies at z = 2 — 3 have a low n < 2 and that 
a large fraction are ultra-compact? In particular, how non-degenerate are the fits? Could some of the galaxies with an 
n < 2 have similarly good fits with higher n? 

First, one should note that the errors quoted by GALFIT on the structural parameters cannot be used to assess the 
robustness of the fits because the errors quoted by GALFIT underestimate the true parameter errors (Haussler et al. 2007), 
which are dominated by the systematics of galaxy structure, and in particular, by parameter coupling and degeneracy. 

The task of assessing the coupling between model parameters is complicated when models have a large number of free
parameters. The single Sersic profiles fit to the NICMOS galaxy images have 7 free parameters (centroid, luminosity, $r_e$,
n, axis ratio, and position angle). While GALFIT selects a best-fit by minimizing $\chi^2$ for a given set of input guesses,
it is not clear whether the minimized $\chi^2$ is an absolute minimum or local minimum. Investigating the $\chi^2$ values for
all combinations of fit parameters over the full multi-dimensional parameter space is prohibitively time consuming and
computationally expensive. Instead, we will adopt a simpler approach of focusing on strong coupling between $r_e$ and n,
and exploring how $\chi^2$ varies as these parameters are moved away from the initial solution picked by GALFIT.

One important point should be noted when using changes in $\chi^2$, or $\Delta\chi^2$, for fits to different models. When errors
are normally distributed, the multi-dimensional ellipsoids for a given $\Delta\chi^2$ contour can be associated with a statistical
confidence level (e.g., $\Delta\chi^2 \sim 1$ corresponds to a 68% confidence level). However, since the errors in the GALFIT models
are not normally distributed, but are instead dominated by the systematics of galaxy structure, this means that we cannot
a priori assign a confidence level to a given $\Delta\chi^2$. As outlined in the test below, we can still use the shape of $\Delta\chi^2$ as a
function of n or $r_e$ as a guide to the quality of fit in the sense that sharp rises in $\chi^2$ as n is varied away from the best-fit
value are taken as indicative of poorer fits. But, we cannot a priori say how much poorer the fits are in a statistical sense.
This is a well-known and hard problem in structural fitting. We will return to this point in §B.2.

We carry out the test below for all galaxies in our sample. We denote as $\chi^2_{\min,0}$ the value of $\chi^2$ obtained when GALFIT
fits the galaxy with n and $r_e$ as free parameters. The associated best-fit parameters are $n_{\min,0}$ and $r_{e,\min,0}$. We then fit
single Sersic profiles with n fixed at discrete values (0.5-10), while allowing all other parameters to freely vary. The initial
inputs to these fits were the same as those used to generate the model in which n is a free parameter. We let GALFIT
find the best-fit for each of these fixed n models by minimizing $\chi^2$, and we record for each such best-fit the following
quantities: the fixed value of n, the best-fit value of $r_e$, and the associated minimum in $\chi^2$, called $\chi^2_{\min}$. We then evaluate
how the difference $\chi^2_{\min} - \chi^2_{\min,0}$ varies as a function of $r_e$ and n, as we move to values away from $n_{\min,0}$ and $r_{e,\min,0}$.

The test was carried out for all galaxies. Figure B1 shows the results of the test for four representative galaxies with
n ~ 1-4. The first column of Figure B1 shows how ($\chi^2_{\min} - \chi^2_{\min,0}$) changes when n is varied away from $n_{\min,0}$ at discrete
values (0.5-10) and GALFIT is allowed to vary all other parameters to get a best-fit that yields $\chi^2_{\min}$. The second column
shows the corresponding best-fit $r_e$ for that $\chi^2_{\min}$. The stars in the plots denote $n_{\min,0}$ and $r_{e,\min,0}$, which are associated
with $\chi^2_{\min,0}$. The shape of $\chi^2_{\min} - \chi^2_{\min,0}$ is asymmetric for n and $r_e$. The coupling between n and $r_e$ means ($\chi^2_{\min} - \chi^2_{\min,0}$)
varies in a similar way with both n and $r_e$.

We can see that in Figure B1 the absolute minimum $\chi^2$ values occur at the $n_{\min,0}$ and $r_{e,\min,0}$ values, which GALFIT
picked when it was allowed to freely fit the galaxies without fixing n. Shifting n away from $n_{\min,0}$ (denoted by the red
stars) by ±1 can increase $\chi^2_{\min}$ by several 10s or 100s of $\chi^2$ units. While only 4 representative galaxies are shown in Figure
B1, we show results for the whole sample in Figure B2. This figure illustrates the distribution of ($\chi^2_{\min} - \chi^2_{\min,0}$)
for $n_{\min,0} - 1$ (top panel) and $n_{\min,0} + 1$ (bottom panel), and demonstrates that, for the whole sample, $\chi^2_{\min}$ generally changes
substantially when n is shifted away from $n_{\min,0}$. We draw two primary conclusions from Figure B1.

[13] The range in PSF FWHM comes from differing positions in the NIC3 field and the PAM values used to create the synthetic PSFs.

1. For galaxies with $n_{\min,0} > 2$ (rows 3 and 4), $\chi^2_{\min} - \chi^2_{\min,0}$ rises sharply at lower $n < n_{\min,0}$, suggesting that lower
n values are unlikely to yield a good fit for such systems. At $n > n_{\min,0}$, $\chi^2_{\min} - \chi^2_{\min,0}$ rises less sharply, but the
rise is still substantial as demonstrated by the high-magnification inset plots in rows 3 and 4 of column 1.

2. The most important point to take from Figure B1 is that for galaxies with $n_{\min,0} < 2$ (as in rows 1 and 2), $\chi^2_{\min} - \chi^2_{\min,0}$
rises rapidly at higher $n > n_{\min,0}$, thereby making it unlikely that a higher n > 2 would provide a similarly
good fit. Thus, we have a great degree of confidence that we are not highly overestimating the number of n < 2
galaxies in the sample.
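
The logic of the fixed-n scan described in this section can be sketched with a toy one-dimensional analogue. This is not the GALFIT procedure, which fits 2D images with PSF convolution: the sketch fits a noiseless 1D Sersic profile, uses the approximation $b_n \approx 2n - 1/3$, searches $r_e$ on a coarse grid, and solves for the amplitude analytically. All names, grids, and the noise level are assumptions made for illustration.

```python
import math


def sersic(r, i_e, r_e, n):
    """1D Sersic profile; b_n uses the approximation 2n - 1/3 (assumed adequate here)."""
    b_n = 2.0 * n - 1.0 / 3.0
    return i_e * math.exp(-b_n * ((r / r_e) ** (1.0 / n) - 1.0))


def fit_fixed_n(n, radii, data, sigma):
    """Hold n fixed; minimize chi^2 over a grid in r_e with the amplitude solved analytically."""
    best_chi2, best_re = float("inf"), None
    for k in range(100):
        r_e = 0.5 + 0.1 * k
        model = [sersic(r, 1.0, r_e, n) for r in radii]
        amp = sum(d * m for d, m in zip(data, model)) / sum(m * m for m in model)
        chi2 = sum(((d - amp * m) / sigma) ** 2 for d, m in zip(data, model))
        if chi2 < best_chi2:
            best_chi2, best_re = chi2, r_e
    return best_chi2, best_re


# Toy "galaxy": an exponential disk (n = 1) with r_e = 3 in arbitrary units.
radii = [0.5 * k for k in range(1, 31)]
data = [sersic(r, 1.0, 3.0, 1.0) for r in radii]

N_GRID = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0]
scan = {n: fit_fixed_n(n, radii, data, sigma=0.01) for n in N_GRID}
chi2_min0 = min(chi2 for chi2, _ in scan.values())
delta = {n: scan[n][0] - chi2_min0 for n in N_GRID}

for n in N_GRID:
    print(f"n = {n}: best r_e = {scan[n][1]:.1f}, delta chi^2 = {delta[n]:.1f}")
```

Printing the best-fit $r_e$ alongside $\Delta\chi^2$ shows the parameter coupling directly: as n is forced away from its best value, the preferred $r_e$ shifts to compensate while the fit quality degrades.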

B.2. Recovery of Parameters From Simulated Images 

Section B.1 tests how well parameters are recovered in real galaxies, but we cannot a priori assign a confidence level
to a given $\Delta\chi^2$ because the errors in the GALFIT models are not normally distributed. However, we can run an extra
complementary test where we use simulated idealized galaxies whose (n, $r_e$) are a priori known. The drawback of using
idealized galaxies as opposed to the real galaxies fitted in §3.1 is that the former lack the complexity of real galaxies, since
they are simply generated from GALFIT models and exactly described by a functional form, such as a Sersic model with
a specified (n, $r_e$). However, the advantage is that we do know the (n, $r_e$) values a priori and can therefore compare these
values to those obtained once these idealized galaxies are inserted into frames with noise properties corresponding to the
NIC3 GNS images of our sample galaxies at z = 1-3.

This test is performed by simulating 1000 galaxy images, each with a unique set of Sersic parameters: surface brightness
at the effective radius $\mu_e$, effective radius $r_e$, Sersic index n, axis ratio b/a, and position angle PA. The parameters are
chosen randomly from uniform distributions spanning the parameter space of the observed galaxies. The ranges in $\mu_e$, $r_e$,
n, b/a, and PA are 16 to 32 mag/arcsec$^2$, 0.05" to 1.0", 0.5 to 10, 0.3 to 1.0, and -90 to 90 degrees, respectively. The chosen
range in input $\mu_e$ mimics the effect of surface brightness dimming, and the range in $r_e$ ensures the simulated objects
span the angular size of the real GNS galaxies. The simulated galaxies were created with GALFIT and convolved with a
drizzled PSF image. They were set within a sky background equivalent to the mean NIC3 sky background within GNS
(0.1 counts/sec). Source noise, sky noise, and read noise (29 e$^-$) were added to the frames.
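
The parameter sampling step of this test can be sketched as follows. The ranges are those quoted above; the dictionary keys, function names, and random seed are hypothetical, and the image generation, PSF convolution, and noise steps (done with GALFIT in the text) are not reproduced.

```python
import random

# Uniform sampling ranges quoted in the text; key names are illustrative.
RANGES = {
    "mu_e": (16.0, 32.0),        # surface brightness at r_e, mag/arcsec^2
    "r_e": (0.05, 1.0),          # effective radius, arcsec
    "n": (0.5, 10.0),            # Sersic index
    "axis_ratio": (0.3, 1.0),    # b/a
    "pa_deg": (-90.0, 90.0),     # position angle, degrees
}


def draw_parameters(n_gal=1000, seed=0):
    """Draw n_gal independent parameter sets from uniform distributions."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
        for _ in range(n_gal)
    ]


def recovered_within(true_value, fitted_value, tol=0.10):
    """True if the fitted value lies within a fractional tolerance of the input."""
    return abs(fitted_value - true_value) <= tol * abs(true_value)


simulated = draw_parameters()
print(len(simulated), simulated[0])
```

A recovery check such as `recovered_within` would then be applied to the fitted (n, $r_e$) of each simulated galaxy to obtain the fraction recovered to within 10% of the input values.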

The simulated images were then re-fit with GALFIT to derive (n, $r_e$). Initial guess parameters for ($\mu_e$, $r_e$, n, b/a, PA)
were generated randomly from uniform distributions spanning ±1.5 mag/arcsec$^2$ in $\mu_e$, ±0.3" in $r_e$, ±2 indices in n, 0.3
to 1 in b/a, and -90 to 90 degrees in PA. Figure B3 shows the recovery in (n, $r_e$) plotted against surface brightness. The
dashed vertical lines represent the minimum, median, and maximum $\mu_e$ for the observed massive galaxies. Figure B3
shows (n, $r_e$) are well recovered across the full range in observed $\mu_e$. The recovery as a function of $\mu_e$ severely degrades
only at several mag/arcsec$^2$ fainter than observed $\mu_e$. In ~ 95% of cases, n and $r_e$ are recovered to within 10% of their
input values for the range of observed $\mu_e$ among the massive galaxies in our sample.

B.3. Tests on MGC Fits 

The structural parameters for the massive galaxies at z ~ 0 are derived by Allen et al. (2006) by using the GIM2D code
(Simard et al. 2002) to fit a single Sersic component to the MGC B-band images. We derived the structural parameters
for the massive galaxies at z = 1-3 by using the GALFIT code (Peng et al. 2002) on the NIC3/F160W images (§3.1).
One might wonder whether the dramatic shift in Figure 5 of the z = 2-3 galaxies toward lower (n, $r_e$) compared to the
z ~ 0 MGC galaxies may be caused by systematic differences between the fitting techniques used by us versus those
by Allen et al. (2006). This would be the case only if the fits by Allen et al. (2006) give systematically higher (n, $r_e$)
than ours for the same galaxies. As we show below this is not the case.

In order to address this issue, we have applied GALFIT to a subset of B-band MGC images and compared our resulting
structural parameters to the GIM2D-based results given in the MGC catalogue. The comparison (top row of Figure B4)
shows that the GIM2D-based fits of Allen et al. (2006) are not biased to higher (n, $r_e$) compared to our GALFIT-based
fits for the z ~ 0 MGC galaxies. In fact, for large $r_e$, the GIM2D-based values may even be lower, in many cases.

These results are consistent with extensive comparisons of single component Sersic fits from GALFIT and GIM2D 
conducted by Haussler et al.(2007) on both simulated and real galaxy data. They concluded that both codes provide 
reliable fits with little systematic error for galaxies with effective surface brightnesses brighter than that of the sky, as 
long as one is not dealing with highly crowded fields. 

Another possible source of difference between the structural parameters of the z ~ 0 and z = 2-3 massive galaxies
might be the fact that Allen et al. (2006) fitted the z ~ 0 massive galaxies with only a single Sersic component, and
did not include an extra point source component in galaxies with evident nuclear sources. It seems unlikely that the
much larger fraction of higher (n, $r_e$) systems at z ~ 0 in Figure 5 is mainly driven by this effect. To illustrate this, we
have fitted the z ~ 0 MGC galaxies in the top row of Figure B4 with a combination of a single Sersic component and a
point source model using GALFIT. The bottom row of Figure B4 shows the results. The values of $r_e$ are not changed
systematically. The Sersic index is lowered by the addition of the point source, but only 20% of the sources with n > 2 in
the single Sersic fit have n < 2 after including the point source. Since not all z ~ 0 MGC galaxies in Figure 5 will have
nuclear sources, the fraction of sources impacted will be even less. We thus conclude that the presence of a point source
in some of the z ~ 0 MGC galaxies and the inclusion of such a point source in the model fits, are not sufficient to shift
the z ~ 0 MGC galaxies into the parameter space occupied by the z = 2-3 massive galaxies in Figure 5.
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Fig. 1. — The distribution of apparent H (F160W), V apparent magnitude, stellar mass, and redshift for the final, complete sample of 166 
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Fig. 2. — We compare the galaxy stellar mass functions for GNS over z = 1 — 3 with those from other studies that are based on K or 
IRAC-selected samples (Kajisawa et al. 2009; Marchesini et al. 2009; Perez-Gonzalez et al. 2008; Fontana et al. 2006). The vertical line 
in each plot marks the mass cut (M^ > 5 x lO-*^^ -^0) for the GNS-based sample used in this paper. We include the data points with error 
bars from the other studies, where available, along with each Schechter function fit. Some studies (Kajisawa et al. 2009; Marchesini et al. 
2009) present results for multiple sets of SED-modeling assumptions, and in these cases we show the results for the assumptions that most 
closely match those used for GNS by Conselice et al. (2011). For Kajisawa et al. (2009) , we show the mass function calculated with Bruzual 
& Chariot (2003) stellar templates. For Marchesini et al. (2009), we show the stellar mass functions calculated with Bruzual &; Chariot 
2003 templates, metallicities of 0.2, 1, and 2.5 Zq, a Kroupa IMF, and a Calzetti extinction law, but in the above plot, we scale their mass 
functions by +0.2 dex along the x-axis to convert their Kroupa IMF to a Salpeter IMF. For the GNS mass functions, in comparison, the best 
metallicity is determined on a galaxy-by-galaxy basis from a set of discrete values spanning 0.005 to 2.5 Zq. The error bars for Marchesini et 
al. (2009) take into account the uncertainties due to cosmic variance, Poisson error, photometric redshifts, and stellar SED templates. The 
error bars from Perez-Gonzalez et al. (2008) account for Poisson error and uncertainty in photometric redshifts. In comparison, the error 
bars on the GNS mass functions show only Poisson error. 
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Fig. 3. — For all galaxies detected in the GOODS-NICMOS Survey (GNS) over 2; = 1 - 3, the rest-frame U -V color is plotted against 
M^ for different redshift bins. Blue systems are preferentially at low masses, while the most massive (M^ > 1 x lO"'^-'^ ^0) galaxies are 
preferentially red. The vertical line denotes M^ = 5 x lO-*^*-" Mq, the mass cut we adopt for our final sample of 166 galaxies. 
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Fig. 5. — The B-band Sersic index n and effective radius re derived from single Sersic profile fits to massive (M^ > 5 x 10-'^*-' Mq) galaxies 
are plotted for the three redshift bins listed in Table [T] In the top row, the black points represent fits to z ~ galaxies by Allen et al. (2006) 
on S-band images of galaxies from the Millennium Galaxy Catalog (Liske et al. 2003). The lower two rows are based on our fits to the NIC3 
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Fig. 6. — Left column: The distributions of rest-frame optical Sersic index and effective radius re based on single Sersic profile fits to 
massive (M^ > 5 x 10-'^'-' Mq) galaxies are plotted for the three redshift bins listed in Table [T] at z ~ (solid line), based on the fits of Allen 
et al. (2006) on S-band images of galaxies from the Millennium Galaxy Catalog (Liske et al. 2003), and at z = 1 — 2 (dash-dotted line) and 
Fig. 7. — Left column: The top panel shows mean extinction-corrected rest-frame B-band surface brightness within the effective radius 
(<μ_e>) for massive (M* > 5 x 10^10 M⊙) galaxies for the three redshift bins listed in Table 1. The solid line is for z ~ 0 MGC galaxies. The 
Fig. 8. — Top row: The black points show the massive (M* > 5 x 10^10 M⊙) z ~ 0 galaxies from MGC described earlier in Figure 5. The 
magenta points denote the SDSS-based sample S1 of 255 representative massive (M* > 5 x 10^10 M⊙) z ~ 0 galaxies used in the redshifting 
experiment. Note that the (n, r_e) distribution of S1 covers the same parameter space as that of the MGC sample. This is also shown quantitatively 
in Figure 9. Row 2: We show as blue squares the (n, r_e) distribution obtained after redshifting S1 to z = 2.5 and 're-observing' it with 
NIC3/F160W as in the GNS survey. We assume a surface brightness evolution of 2.5 magnitudes and brighten each redshifted galaxy by this 
amount. The actual observed (n, r_e) distributions of the massive galaxies at z = 2-3 in the GNS survey are significantly offset toward 
lower values compared to the redshifted galaxies. The black dashed line represents the typical half width at half maximum of the NICMOS3 PSF at 
z = 1-3 of ~ 1.2 kpc. 
Fig. 9. — This figure presents the same information as Figure 8, but in more quantitative terms. It shows the n and r_e distributions for 
the full MGC sample of massive z ~ 0 galaxies (black line) and the representative sample S1 of 255 galaxies used in the redshifting experiment 
(magenta line). Sample S1 matches the full MGC sample well and is typically within ±10% for a given bin. We also contrast 
the (n, r_e) values after redshifting S1 to z = 2.5 (blue line) with the actual distribution observed in the massive GNS galaxies at z = 2-3 
(red line). While 64.9 ± 5.4% and 39.0 ± 5.6% of the massive z = 2-3 galaxies have n < 2 and r_e < 2 kpc, respectively, the corresponding 
fractions for the redshifted sample are 10.6 ± 1.9% and 1.2 ± 0.7%. The results shown here are for galaxies with M* > 5 x 10^10 M⊙, but a 
similar result is obtained for M* > 1 x 10^11 M⊙. 
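As an illustrative check, the quoted uncertainties on the redshifted-sample fractions are consistent with a simple binomial standard error, sqrt(p(1 - p)/N), for N = 255. The sketch below is not taken from the paper: the counts of 27 and 3 galaxies are back-computed from the quoted percentages and are an assumption, as is the choice of error estimator.

```python
import math

def fraction_with_error(k, n):
    """Return the fraction k/n and its binomial standard error."""
    p = k / n
    se = math.sqrt(p * (1.0 - p) / n)
    return p, se

# Size of the redshifted sample S1 quoted in the caption.
N_S1 = 255

# Assumed counts, inferred from the quoted 10.6% and 1.2% (hypothetical).
assumed_counts = {"n < 2": 27, "r_e < 2 kpc": 3}

for label, k in assumed_counts.items():
    p, se = fraction_with_error(k, N_S1)
    print(f"{label}: {100 * p:.1f} +/- {100 * se:.1f} %")
```

Under these assumptions the script prints 10.6 +/- 1.9 % and 1.2 +/- 0.7 %, matching the values quoted for the redshifted sample.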
Fig. 10. — Left column: The panels compare the rest-frame optical structural parameters (Sersic index n and effective radius r_e) of massive 
(M* > 5 x 10^10 M⊙) elliptical and S0 galaxies at z ~ 0 to the structural parameters recovered after these galaxies were artificially redshifted 
to z = 2.5, brightened by 2.5 magnitudes in surface brightness, and re-observed with NIC3/F160W. At z ~ 0, the structural parameters 
were measured from B-band images, while at z = 2.5 they are measured from the artificially redshifted images in the NIC3/F160W band, so 
that all parameters are measured in the rest-frame blue optical light. The black lines represent equality, while the shaded area represents 
the regime of n < 2 and r_e < 2 kpc, where 64.9 ± 5.4% and 39.0 ± 5.6%, respectively, of massive GNS galaxies at z = 2-3 lie (Table 1 
and Figure 8). The plots show that the Sersic index n and effective radius r_e of the massive z ~ 0 E and S0 galaxies may be lower or higher after 
redshifting out to z = 2.5, but they do not, in general, drop to values as low as n < 2 and r_e < 2 kpc, and so avoid the shaded area. Right 
column: Same as left column, but for massive (M* > 5 x 10^10 M⊙) z ~ 0 spiral galaxies. The galaxies are coded by bulge-to-total 
light ratio (B/T). B/T was measured with bulge-disk and bulge-disk-bar decomposition of the z ~ 0 B-band images. The top plot shows that 
it is mainly massive z ~ 0 late-type spirals of low B/T that yield a Sersic index as low as n < 2 after redshifting, and populate the shaded 
area where 64.9 ± 5.4% of massive GNS galaxies at z = 2-3 lie. However, as shown by the lower plot, the local massive spirals have much 
larger r_e (r_e > 2 kpc) and after artificial redshifting avoid the shaded area where 39.0 ± 5.6% of the massive GNS galaxies at z = 2-3 lie. 
Fig. 11. — Top left: The f_24μm distribution for the massive (M* > 5 x 10^10 M⊙) GNS galaxies with a reliable MIPS 24 μm counterpart. 
Upper right: The inferred L_IR distribution over 8-1000 μm. Lower left: The inferred SFR_IR distribution based on L_IR, which is estimated 
using the Chary & Elbaz (2001) templates, with a correction at L_IR > 6 x 10^11 L⊙. Lower right: SFR_IR versus M*. For sources containing 
an AGN, the measured L_IR and SFR_IR are upper limits. The upper right and bottom panels use different coding for sources identified in §6 
as hosting an AGN. 
corresponding SFR-mass correlation at z ~ 2. The right-hand panel shows mean SFR_IR in the different redshift bins for sources with SFR_IR 
Fig. 14. — Same as Figure 13, but now the data are sorted by half-light radius r_e. Note that only a small fraction of the ultra-compact 
(r_e < 2 kpc) galaxies have SFR_IR above the 5σ detection limit. Some ultra-compact galaxies have high SFR_IR, but, on average, their mean 
SFR_IR is lower than in more extended systems. 
Fig. 15. — Left column: For galaxies with SFR_IR above the 5σ detection limit, the distributions of cold gas surface density (Σ_gas), cold 
gas mass (M_gas), and cold gas fraction (f_gas(r_e)) within the circularized optical half-light radius r_e are shown for different redshift ranges. 
Σ_gas is calculated using a Schmidt-Kennicutt law (Kennicutt 1998) with power-law index 1.4 and a normalization factor of 2.5 x 10^-4. The cold 
gas fraction (f_gas(r_e) = M_gas/(M_gas + M*)) is calculated relative to the total baryonic mass within r_e. Right column: Same as left column, 
except that only non-AGN sources are shown. 
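The gas estimate described in this caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the Kennicutt (1998) units (Σ_SFR in M⊙ yr^-1 kpc^-2, Σ_gas in M⊙ pc^-2), and it assumes that half of the total SFR and half of the stellar mass lie within r_e (the `frac_within_re` parameter), which the caption does not state. The function and parameter names are hypothetical.

```python
import math

KS_NORM = 2.5e-4   # normalization quoted in the caption
KS_INDEX = 1.4     # power-law index quoted in the caption

def gas_surface_density(sigma_sfr):
    """Invert Sigma_SFR = KS_NORM * Sigma_gas**KS_INDEX.

    sigma_sfr is in Msun/yr/kpc^2; the result is in Msun/pc^2
    (units assumed from Kennicutt 1998).
    """
    return (sigma_sfr / KS_NORM) ** (1.0 / KS_INDEX)

def cold_gas_within_re(sfr, r_e_kpc, m_star, frac_within_re=0.5):
    """Estimate (Sigma_gas, M_gas, f_gas) inside the half-light radius.

    Assumption: a fraction frac_within_re of both the SFR and the
    stellar mass is enclosed within r_e.
    """
    area_kpc2 = math.pi * r_e_kpc ** 2
    sigma_sfr = frac_within_re * sfr / area_kpc2
    sigma_gas = gas_surface_density(sigma_sfr)
    # Convert the area from kpc^2 to pc^2 (1 kpc^2 = 1e6 pc^2).
    m_gas = sigma_gas * area_kpc2 * 1.0e6
    f_gas = m_gas / (m_gas + frac_within_re * m_star)
    return sigma_gas, m_gas, f_gas

# Example with made-up inputs: SFR = 100 Msun/yr, r_e = 2 kpc, M* = 1e11 Msun.
sigma_gas, m_gas, f_gas = cold_gas_within_re(100.0, 2.0, 1.0e11)
print(f"Sigma_gas = {sigma_gas:.0f} Msun/pc^2, M_gas = {m_gas:.2e} Msun, f_gas = {f_gas:.2f}")
```

Because Σ_gas scales as Σ_SFR^(1/1.4), a compact galaxy with the same SFR as an extended one yields a higher inferred gas surface density, although M_gas itself is lower because the enclosed area is smaller.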
Fig. 16. — Top: For galaxies with SFR_IR above the 5σ detection limit, the mean cold gas fraction (f_gas(r_e) = M_gas/(M_gas + M*)) within 
Fig. 17. — The upper and lower-left panels show single Sersic index n versus effective radius r_e for the 49 AGN candidates selected 
based on X-ray properties, a mid-IR power law, or an IR-to-optical excess. The lower-right panel shows the median Sersic index and r_e in each 
redshift bin. 
Fig. 18. — In the top panels, the deconvolved ellipticity (1 - b/a) measured by GALFIT is shown for massive (M* > 5 x 10^10 M⊙) GNS 
galaxies at z = 2-3 with n < 2 and n > 2. The bottom panels show the deconvolved ellipticity for similarly massive E/S0 and spiral galaxies 
as measured with GIM2D by Allen et al. (2006). 
Fig. B1. — For four representative galaxies with n 
respectively. χ²_min,0 is the minimum χ² obtained when all parameters are freely fit, and χ²_min 
values (0.5-10). 
r_e,min,0 corresponding to χ²_min,0. The insets in rows 3 and 4 of column 1 show a magnified view around the minimum in χ²_min - χ²_min,0. Note 
that for galaxies with n_min,0 < 2 (rows 1 and 2), χ²_min - χ²_min,0 rises sharply at higher n > n_min,0, thereby making it unlikely that a higher 
n > 2 would provide a similarly good fit. 
Fig. B2. — The quantity χ²_min - χ²_min,0 from Figure B1 is shown for all massive GNS galaxies well fitted with a single Sersic profile. The 
top panel evaluates χ²_min - χ²_min,0 at n_min,0 - 1, and the bottom panel evaluates χ²_min - χ²_min,0 at n_min,0 + 1, where n_min,0 is the best-fit 
Sersic index corresponding to χ²_min,0. 
Fig. B3. — For the simulations described in §B.2, the differences between input and output Sersic index n and effective radius r_e are plotted 
against effective surface brightness μ_e, the surface brightness at r_e. The vertical lines correspond to the range in μ_e in the NIC3/F160W 
band for the massive galaxies at z = 1-3 in our sample. 
Fig. B4. — Top row: We demonstrate for a subset of z ~ 0 galaxies in the MGC catalog that the GIM2D-based (n, r_e) values from Allen 
et al. (2006) are not biased to higher values compared to our GALFIT-based fits for the same galaxies. All fits are performed on the B-band 
images from MGC. Bottom row: We show the effects of adding a point source in the GALFIT models fitted to the z ~ 0 MGC galaxies. The 
values obtained using a model made of a Sersic component plus a point source are plotted along the y-axis, while the x-axis shows the values 
obtained with a single Sersic component. The values of r_e are not changed systematically. The Sersic index is lowered by the addition of the 
point source, but only 20% of sources with n > 2 in the single Sersic fit have n < 2 after including the point source. 



